I have a pre-trained sequential CNN model that I trained on images of size 224x224x3. The following is the architecture:

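# Imports needed by the snippet (tf.keras, as shipped with TensorFlow 2.x)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense

model = Sequential()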
model.add(Conv2D(filters = 64, kernel_size = (5, 5), strides = 1, activation = 'relu', input_shape = (224, 224, 3)))
model.add(MaxPool2D(pool_size = (3, 3)))
model.add(Dropout(0.2))

model.add(Conv2D(filters = 128, kernel_size = (3, 3), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))

model.add(Conv2D(filters = 256, kernel_size = (2, 2), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))

model.add(Flatten())
model.add(Dense(128, activation = 'relu', use_bias=False))

model.add(Dense(num_classes, activation = 'softmax'))     

model.summary()

For reference, here is the model summary: [image: model summary]

I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)". What should I do to resolve this error?

Note: I am using Tensorflow version 2.4.1

  • can you post the output of model.summary? Commented Apr 7, 2021 at 12:01
  • @BerkayBerabi I have included the model summary in the question. Please check. :) Commented Apr 7, 2021 at 12:48

1 Answer

The problem is that in your pre-trained model the flattened output has 200704 features (fourth line from the end), and the dense layer that follows (third line from the end) maps them to 128 outputs, so its weight matrix is fixed at 200704 x 128. Feeding 40x40 images through the same pre-trained model cannot work, for two reasons:

1- Your model depends on the input image shape. It is not a fully convolutional model: the dense layers after the flatten step tie it to one fixed input size.

2- For a 40x40 image, the flattened output after all the conv layers has size 256, not 200704.
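The dependence on input size can be checked by hand. The sketch below is only an illustration: the helper names are made up, and it assumes the layers exactly as posted in the question, with the Keras defaults of 'valid' padding and a pooling stride equal to the pool size. With those assumptions the numbers come out different from the 200704 and 256 in the error message, so the trained model presumably differs from the posted snippet in some detail (padding or pool sizes, for example). The point is only that the flattened size changes with the image size.

```python
def conv_out(n, kernel, stride=1):
    """Output length of a 'valid' convolution along one spatial axis."""
    return (n - kernel) // stride + 1


def pool_out(n, pool):
    """Output length of max pooling with stride equal to the pool size."""
    return (n - pool) // pool + 1


def flatten_size(n, channels=256):
    """Flattened feature count for an n x n input through the posted layers."""
    n = pool_out(conv_out(n, 5), 3)  # Conv2D 5x5, then MaxPool2D 3x3
    n = pool_out(conv_out(n, 3), 2)  # Conv2D 3x3, then MaxPool2D 2x2
    n = pool_out(conv_out(n, 2), 2)  # Conv2D 2x2, then MaxPool2D 2x2
    return n * n * channels


print(flatten_size(224))  # 73984
print(flatten_size(40))   # 1024
```

The first dense layer is built for one of these sizes and cannot accept the other.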

Solution

1- Either replace the flatten layer with an adaptive average pooling layer; your last dense layer with softmax is then fine for any input size. You would have to retrain the modified model on the 224x224 images first, and after that you can train on your 40x40 images.

2- Or, the easiest way: reuse your pre-trained model only up to the flatten layer (excluding it), then add a new flatten layer, a dense layer, and a classification layer (the one with softmax). For this method you have to write a custom model, like here: the first part is the subset of the pre-trained model, and the flatten and classification parts are new. You can then train the whole model on the new dataset. This method also lets you benefit from transfer learning: freeze the pre-trained layers so that the backward gradient updates only the newly created dense layers and not the pre-trained ones.
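Both options might look roughly like the sketch below in tf.keras. It is not a drop-in solution. The assumptions are that `num_classes = 10` is a placeholder, that `pretrained` is rebuilt here from the posted architecture as a stand-in for your saved model, and that `conv_stack`, `new_model` and `size_free` are names invented for the example. For option 1 it uses `GlobalAveragePooling2D`, which is what adaptive average pooling reduces to when the target output is 1x1.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 10  # placeholder value, not taken from the question


def conv_stack():
    """Convolutional part of the architecture posted in the question."""
    return [
        layers.Conv2D(64, (5, 5), activation="relu"),
        layers.MaxPool2D((3, 3)),
        layers.Dropout(0.2),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPool2D((2, 2)),
        layers.Dropout(0.2),
        layers.Conv2D(256, (2, 2), activation="relu"),
        layers.MaxPool2D((2, 2)),
        layers.Dropout(0.2),
    ]


# Stand-in for the pre-trained model; in practice you would load your own
# trained model instead of building a fresh one.
pretrained = models.Sequential(
    [layers.Input(shape=(224, 224, 3))]
    + conv_stack()
    + [
        layers.Flatten(),
        layers.Dense(128, activation="relu", use_bias=False),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

# Option 2: reuse (and freeze) everything before Flatten, then add a new head.
inputs = layers.Input(shape=(40, 40, 3))
x = inputs
for layer in pretrained.layers:
    if isinstance(layer, layers.Flatten):
        break  # stop before the size-dependent part
    layer.trainable = False  # the pre-trained weights will not be updated
    x = layer(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)
new_model = models.Model(inputs, outputs)
new_model.compile(optimizer="adam", loss="categorical_crossentropy")

# Option 1: global average pooling instead of Flatten, so the dense layers
# always receive 256 features, whatever the image size.
size_free = models.Sequential(
    [layers.Input(shape=(None, None, 3))]
    + conv_stack()
    + [
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu", use_bias=False),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
```

With option 2, `new_model` can be trained on the 40x40 data straight away. To fine-tune the convolutional layers later, set their `trainable` attribute back to `True` and compile again. With option 1, `size_free` accepts both 224x224 and 40x40 inputs, but its weights have to be trained from scratch because the dense layer no longer matches the old one.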

2 Comments

Thank you. The 2nd solution solves the error. Also, is there any implementation of adaptive average pooling in Keras/TensorFlow, or is it only in PyTorch?
