I am trying to create a CNN for regression. The input is image data. For learning purposes, I have 10 images in a tensor of shape (10, 3, 448, 448), where 10 is the number of images, 3 is the number of channels, and 448 is both the height and the width.
The output labels have shape (10, 245). Here is my architecture:

import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5)
        self.conv2 = nn.Conv2d(32, 32, kernel_size=5)
        self.conv3 = nn.Conv2d(32,64, kernel_size=5)
        self.fc1 = nn.Linear(3*3*64, 256)
        self.fc2 = nn.Linear(256, 245)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        #x = F.dropout(x, p=0.5, training=self.training)
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = F.dropout(x, p=0.5, training=self.training)
        x = F.relu(F.max_pool2d(self.conv3(x),2))
        x = F.dropout(x, p=0.5, training=self.training)
        x = x.view(-1,3*3*64 )
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return x

cnn = CNN()
print(cnn)

it = iter(train_loader)
X_batch, y_batch = next(it)
print(cnn.forward(X_batch).shape)

With a batch size of 2, I expect the model to produce output of shape (2, 245), but it produces output of shape (2592, 245).

1 Answer

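After self.conv3 and the max-pool that follows it, the tensor has shape [2, 64, 108, 108]. The call x.view(-1, 3*3*64) reshapes it into rows of 576 elements, and 2*64*108*108 / 576 = 2592, so the result is [2592, 576]. That is where 2592 comes from. Change the lines "self.fc1 = nn.Linear(3*3*64, 256)" and "x = x.view(-1,3*3*64)" so that they use the actual feature-map size after the convolution and pooling layers, which is 108*108*64.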
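The shape arithmetic can be checked without running the model. The sketch below assumes PyTorch's defaults for the layers in the question (stride 1 and no padding for nn.Conv2d, and a stride equal to the kernel size for F.max_pool2d); the helper names conv_out and pool_out are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output length of a convolution along one spatial dimension
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel):
    # Max-pooling with stride equal to the kernel size and no padding
    return (size - kernel) // kernel + 1

batch, size = 2, 448
size = conv_out(size, 5)               # conv1: 448 -> 444
size = pool_out(conv_out(size, 5), 2)  # conv2 + pool: 444 -> 440 -> 220
size = pool_out(conv_out(size, 5), 2)  # conv3 + pool: 220 -> 216 -> 108

total = batch * 64 * size * size       # elements in the [2, 64, 108, 108] tensor
rows = total // (3 * 3 * 64)           # rows produced by x.view(-1, 3*3*64)
print(size, rows)                      # 108 2592
```

Because view(-1, 576) only fixes the row length, the leftover factor of 1296 per image is folded into the batch dimension instead of raising an error.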

Below is the fixed code, with print statements added to show the shape after each step:

import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5)
        self.conv2 = nn.Conv2d(32, 32, kernel_size=5)
        self.conv3 = nn.Conv2d(32,64, kernel_size=5)
        # 64 channels of 108x108 feature maps remain after conv3 and pooling
        self.fc1 = nn.Linear(108*108*64, 256)
        self.fc2 = nn.Linear(256, 245)

    def forward(self, x):
        print (x.shape)
        x = F.relu(self.conv1(x))
        print (x.shape)
        #x = F.dropout(x, p=0.5, training=self.training)
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        print (x.shape)
        x = F.dropout(x, p=0.5, training=self.training)
        print (x.shape)
        x = F.relu(F.max_pool2d(self.conv3(x),2))
        print (x.shape)
        x = F.dropout(x, p=0.5, training=self.training)
        print (x.shape)
        x = x.view(-1,108*108*64 )
        print (x.shape)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return x

cnn = CNN()
print(cnn)

# X_batch, y_batch = next(it)
print(cnn.forward(X_batch).shape)
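Hard-coding 108*108*64 ties the model to 448x448 inputs. One possible alternative, shown here only as a sketch, is to measure the flattened size once with a dummy batch and to flatten with torch.flatten(x, 1), which always keeps the batch dimension. The in_size and n_outputs parameters and the _features helper are assumptions added for this illustration and are not part of the original code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self, in_size=448, n_outputs=245):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5)
        self.conv2 = nn.Conv2d(32, 32, kernel_size=5)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=5)
        # Pass one dummy image through the conv stack to measure the
        # number of features per sample instead of computing it by hand.
        with torch.no_grad():
            dummy = torch.zeros(1, 3, in_size, in_size)
            n_features = self._features(dummy).shape[1]
        self.fc1 = nn.Linear(n_features, 256)
        self.fc2 = nn.Linear(256, n_outputs)

    def _features(self, x):
        # Same conv / pool / dropout sequence as in the code above
        x = F.relu(self.conv1(x))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = F.dropout(x, p=0.5, training=self.training)
        x = F.relu(F.max_pool2d(self.conv3(x), 2))
        x = F.dropout(x, p=0.5, training=self.training)
        # Flatten everything except the batch dimension
        return torch.flatten(x, 1)

    def forward(self, x):
        x = self._features(x)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        return self.fc2(x)


# Small 64x64 inputs keep this demo light; pass in_size=448 for the
# images described in the question.
small_cnn = CNN(in_size=64)
small_out = small_cnn(torch.zeros(2, 3, 64, 64))
print(small_out.shape)
```

With this layout a mismatch between the flattened size and fc1 raises a shape error in fc1 rather than silently changing the batch dimension.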