
I want to use not only the feature-extractor pre-trained weights but also the feature-map layers' classifier/localization pre-trained weights when fine-tuning TensorFlow object detection models (SSD) with the TensorFlow Object Detection API. When my new model has a different number of classes from the pre-trained model I'm using as the fine-tuning checkpoint, how does the TensorFlow Object Detection API handle the classification weight tensors?

When fine-tuning pre-trained SSD-style object detection models, I can initialize not only the feature-extractor weights from the pre-trained checkpoint but also the feature-map localization and classification layer weights, keeping for the latter only the pre-trained weights of the classes I want. This lets me reduce the number of classes the model can initially identify (for example, from the 90 MS COCO classes down to a chosen subset of those classes, such as cars and pedestrians only).
https://github.com/pierluigiferrari/ssd_keras/blob/master/weight_sampling_tutorial.ipynb
This is how it is done for Keras models (i.e., in .h5 files), and I want to do the same with the TensorFlow Object Detection API. It seems that at training time I can specify the number of classes the new model will have in the config protobuf file, but since I'm new to the API (and to TensorFlow) I haven't been able to follow the source structure and understand how that number is handled during fine-tuning. Most SSD implementations I know simply re-initialize the classification weight tensor when the pre-trained model's class weight shape differs from the new model's, but I want to retain the relevant classification weights and train from those. How would I do that within the API's structure?
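For reference, the weight sub-sampling idea from the linked ssd_keras tutorial can be sketched with NumPy alone. The shapes, anchor count, and channel layout below are illustrative assumptions (SSD class predictors typically emit `num_anchors * (num_classes + 1)` channels, with each anchor's class logits in a contiguous block; the actual layout depends on the implementation, so verify against the checkpoint you use):

```python
import numpy as np

# Hypothetical SSD class-predictor conv kernel from a COCO-trained model:
# shape = [kernel_h, kernel_w, in_channels, num_anchors * (num_classes + 1)]
num_anchors = 6
coco_classes = 90                      # 90 classes + 1 background
kernel = np.random.randn(3, 3, 256, num_anchors * (coco_classes + 1))

# Class indices to keep (0 = background). These ids are illustrative --
# check them against the actual label map of your checkpoint.
keep = [0, 3, 1]

# Assuming each anchor's (num_classes + 1) logits form a contiguous block,
# pick the same class offsets inside every anchor block.
idx = np.concatenate(
    [a * (coco_classes + 1) + np.array(keep) for a in range(num_anchors)]
)
sub_kernel = kernel[..., idx]

print(sub_kernel.shape)                # (3, 3, 256, 18): 6 anchors * 3 classes
```

The sub-sampled tensor could then be written back into a checkpoint whose class predictor was built for the smaller class count; the bias vector of the same layer would be sliced with the same `idx`.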
Thanks!

1 Answer


Reading through the code, I found the responsible function: it retains the pre-trained model's weights only if the shapes of the corresponding variables in the newly-defined model and the pre-trained checkpoint match. So if I change the number of classes, the shapes of the classifier layers change, and those pre-trained weights are not retained.

https://github.com/tensorflow/models/blob/master/research/object_detection/utils/variables_helper.py#L133
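The gist of that shape check can be re-implemented in a few lines of plain Python. This is a minimal sketch of the behavior, not the API's actual code; the variable names and shapes below are made up for illustration (546 = 6 anchors * 91 COCO classes incl. background, 18 = 6 anchors * 3 classes):

```python
def filter_restorable(model_vars, ckpt_vars):
    """Return the subset of model variables that would be restored.

    model_vars / ckpt_vars: dicts mapping variable name -> shape tuple.
    A variable is restored only if the checkpoint contains a variable
    with the same name AND an exactly matching shape; otherwise it
    keeps its fresh (random) initialization.
    """
    return {
        name: shape
        for name, shape in model_vars.items()
        if ckpt_vars.get(name) == shape
    }

# Changing num_classes from 90 to 2 changes the class-predictor shape,
# so only the feature-extractor variable survives the filter.
model_vars = {
    "FeatureExtractor/conv1/weights": (3, 3, 3, 32),
    "BoxPredictor/ClassPredictor/weights": (3, 3, 256, 18),   # 2 classes
}
ckpt_vars = {
    "FeatureExtractor/conv1/weights": (3, 3, 3, 32),
    "BoxPredictor/ClassPredictor/weights": (3, 3, 256, 546),  # 90 classes
}
print(sorted(filter_restorable(model_vars, ckpt_vars)))
# ['FeatureExtractor/conv1/weights']
```

To keep a subset of the pre-trained class weights, you would therefore have to edit the checkpoint itself (slice the classification tensors down to the desired classes, as in the ssd_keras tutorial) before pointing `fine_tune_checkpoint` at it, so that the shapes match and the filter lets them through.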


10 Comments

Hello @Kazuya, does "the pre-trained weights are not retained" also include the feature-extractor pre-trained weights? Thank you so much for asking this question, because it bothers me too!
Hi @willSapgreen, no, it does not include the feature-extractor pre-trained weights. It seems that when the number of classes changes, only the classification layers' weights are re-initialized, while the feature-extractor and localization layers' weights are retained, because their shapes do not depend on the number of classes.
Hello @Kazuya Hatta, thank you for the quick response. Another question: are both the classification layers' and the feature-extractor/localization layers' weights updated during training? I ask because of this question: ai.stackexchange.com/questions/6129/…
@willSapgreen yes, I think they are all updated. Regarding your question on ai.stackexchange: did you set the class number to 1 correctly, and build the tfrecord file correctly? If everything was done according to the documentation and you are still getting unreasonable results, the other possibility is that your data is not properly annotated.
Thank you for the feedback. I will double-check the tfrecord (extract the image data from it). Another question: in your experience, is there any other approach to debugging bad performance (like checking the weights in the classification layers) besides looking at the total loss and mAP? Thank you.
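To make the shape dependence discussed in these comments concrete, here is a small worked example of the predictor output depths. The anchor count of 6 per location is an assumption (a common SSD default); the box predictor's depth involves only the 4 box coordinates, which is why localization weights survive a class-count change while classification weights do not:

```python
num_anchors = 6  # assumed anchors per feature-map location

def class_pred_depth(num_classes):
    # Class predictor: one (num_classes + 1)-way logit block per anchor.
    return num_anchors * (num_classes + 1)

def box_pred_depth():
    # Box (localization) predictor: 4 coordinates per anchor,
    # independent of the number of classes.
    return num_anchors * 4

print(class_pred_depth(90), class_pred_depth(2), box_pred_depth())
# 546 18 24
```

So going from 90 classes to 2 changes the class-predictor depth from 546 to 18, while the box-predictor depth stays at 24.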
