Training a pre-trained sequential model with different input shape

Question

I have a pre-trained sequential CNN model which I trained on images of 224x224x3. The following is the architecture:

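from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense

# num_classes (the number of target classes) is defined elsewhere
model = Sequential()

# Block 1: 5x5 convolution on the 224x224x3 input
model.add(Conv2D(filters=64, kernel_size=(5, 5), strides=1, activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPool2D(pool_size=(3, 3)))
model.add(Dropout(0.2))

# Block 2: 3x3 convolution
model.add(Conv2D(filters=128, kernel_size=(3, 3), strides=1, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

# Block 3: 2x2 convolution
model.add(Conv2D(filters=256, kernel_size=(2, 2), strides=1, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

# Classifier head: the Flatten output size depends on the input image size
model.add(Flatten())
model.add(Dense(128, activation='relu', use_bias=False))

model.add(Dense(num_classes, activation='softmax'))

model.summary()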

For reference, here is the model summary: [image: model summary]

I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)". What should I do to resolve this error?

Note: I am using TensorFlow version 2.4.1.

@BerkayBerabi I have included the model summary in the question. Please check. :)
– Kunchanapalli Manohar, Commented Apr 7, 2021 at 12:48

Pankaj Mishra · Accepted Answer · 2021-04-07 13:11:09Z

The problem is that in your pre-trained model the flattened output has size 200704 (fourth line from the end of the summary), and the dense layer after it maps that to 128 outputs (third line from the end). The dense layer's weights are therefore tied to an input of exactly 200704 features, so the same pre-trained model will not work on 40x40 images. The reasons are:

1- Your model depends on the input image shape. It is not a fully convolutional model: the dense layers after the flatten step expect a fixed number of input features, which ties the model to one image size.

2- For a 40x40 image, the flattened size after all the conv layers is 256, not 200704.
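
To make the size dependence concrete, here is a small sketch that traces the spatial size through the conv and pooling layers as written in the question's code, assuming the Keras defaults (`padding='valid'`, pooling stride equal to the pool size). The helper names are made up for illustration. The absolute numbers it produces need not match the 200704 and 256 in the error message, which come from the model that was actually trained; the point is only that the flattened size changes with the input size while the dense layer expects one fixed value.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a 'valid' convolution."""
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output side length of 'valid' max pooling with stride == pool size."""
    return (size - pool) // pool + 1


def flatten_size(side, channels=256):
    """Number of features after Flatten for a square input of the given side."""
    side = pool_out(conv_out(side, 5), 3)  # Conv2D 5x5 -> MaxPool2D 3x3
    side = pool_out(conv_out(side, 3), 2)  # Conv2D 3x3 -> MaxPool2D 2x2
    side = pool_out(conv_out(side, 2), 2)  # Conv2D 2x2 -> MaxPool2D 2x2
    return side * side * channels


print(flatten_size(224))  # 73984 for the layers as posted
print(flatten_size(40))   # 1024 for the layers as posted
```

Whatever the exact layer settings, the two values differ, so a dense layer trained on one of them cannot accept the other.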

Solution

1- Either replace the flatten part with an adaptive average pooling layer; your last dense layer with softmax can then stay as it is. Retrain the modified model on the 224x224 images first, and after that you can train on your 40x40 images.

2- Or, the easiest way, use only the part of your pre-trained model up to the flatten layer (excluding the flatten layer itself) and then add a new flatten layer, a dense layer, and a classification layer (the layer with softmax). For this method you have to write a custom model, like here: the first part is the subset of the pre-trained model, and the flatten and classification parts are new. You can then train the whole model on the new dataset. This method also lets you benefit from transfer learning, by allowing the backward gradient to flow only through the newly created dense layers and not through the pre-trained layers.
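
A minimal sketch of option 2 with the Keras functional API, under some assumptions: `pretrained` is the loaded Sequential model from the question, it contains a `Flatten` layer as shown there, and the new head mirrors the old one (a 128-unit dense layer plus softmax). The function name `rebuild_for_new_input` and the `freeze_base` flag are hypothetical, not part of Keras.

```python
from tensorflow.keras import layers, models


def rebuild_for_new_input(pretrained, new_input_shape, num_classes, freeze_base=True):
    """Reuse the layers before Flatten and attach a fresh classifier head."""
    # Skip any InputLayer so the conv stack can be wired to a new input
    body = [l for l in pretrained.layers if not isinstance(l, layers.InputLayer)]
    # Index of the first Flatten layer; everything before it is kept
    flatten_idx = next(i for i, l in enumerate(body) if isinstance(l, layers.Flatten))

    inputs = layers.Input(shape=new_input_shape)
    x = inputs
    for layer in body[:flatten_idx]:
        # Frozen layers keep their pre-trained weights during training
        layer.trainable = not freeze_base
        x = layer(x)

    # New head: its weights are created for the new flattened size
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)
```

For example, `new_model = rebuild_for_new_input(model, (40, 40, 3), num_classes)` followed by the usual `compile` and `fit` calls. With `freeze_base=True` only the new dense layers are trained, which is the transfer-learning variant described above; pass `freeze_base=False` to train the whole model.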

2 Comments

Kunchanapalli Manohar Over a year ago

Thank you. The 2nd solution solves the error. Also, is there an implementation of adaptive average pooling in Keras/TensorFlow, or is it only in PyTorch?

Pankaj Mishra Over a year ago

here is a link to that - tensorflow.org/addons/api_docs/python/tfa/layers/…
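
For option 1, core Keras already ships `GlobalAveragePooling2D`, which averages each feature map down to a single value and so corresponds to adaptive average pooling with a 1x1 output. A rough sketch of the question's architecture with that layer in place of `Flatten` follows. The function name `build_size_agnostic_model` is made up for illustration, and leaving the input height and width as `None` is an assumption about how the model would be used.

```python
import numpy as np
from tensorflow.keras import layers, models


def build_size_agnostic_model(num_classes):
    """Conv stack from the question, with global average pooling instead of Flatten."""
    return models.Sequential([
        # Height and width are left unspecified so any large-enough image fits
        layers.Input(shape=(None, None, 3)),
        layers.Conv2D(64, (5, 5), activation="relu"),
        layers.MaxPool2D(pool_size=(3, 3)),
        layers.Dropout(0.2),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPool2D(pool_size=(2, 2)),
        layers.Dropout(0.2),
        layers.Conv2D(256, (2, 2), activation="relu"),
        layers.MaxPool2D(pool_size=(2, 2)),
        layers.Dropout(0.2),
        # Always yields 256 features, whatever the spatial size was
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu", use_bias=False),
        layers.Dense(num_classes, activation="softmax"),
    ])


model = build_size_agnostic_model(num_classes=5)
# The same weights accept both image sizes
big = model.predict(np.zeros((1, 224, 224, 3), dtype="float32"), verbose=0)
small = model.predict(np.zeros((1, 40, 40, 3), dtype="float32"), verbose=0)
```

As the answer notes, this changes the architecture, so the model has to be retrained on the 224x224 images before it is trained further on the 40x40 ones.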
