
My dataset looks like the following:

[dataset screenshot: molecule strings (inputs) on the left, target values (outputs) on the right]

On the left are my inputs, and on the right the outputs. The inputs are tokenized and converted to lists of indices; for instance, the molecule input 'CC1(C)Oc2ccc(cc2[C@H]([C@@H]1O)N3CCCC3=O)C#N' is converted to:

[28, 28, 53, 69, 28, 70, 40, 2, 54, 2, 2, 2, 69, 2, 2, 54, 67, 28, 73, 33, 68, 69, 67, 28, 73, 73, 33, 68, 53, 40, 70, 39, 55, 28, 28, 28, 28, 55, 62, 40, 70, 28, 63, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

I use the following list of characters as my map from characters to indices:

cs = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
      'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
      '0','1','2','3','4','5','6','7','8','9',
      '=','#',':','+','-','[',']','(',')','/','\\','@','.','%']

Thus, every character in the input string maps to an index, and if the input string is shorter than the maximum input length (which is 100), I pad it with zeros, as in the example shown above.
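A minimal sketch of that encoding (the encode helper name is just illustrative):

max_len = 100

def encode(smiles, cs, max_len=max_len):
    # map each character to its position in cs ...
    idx = [cs.index(ch) for ch in smiles]
    # ... then pad with zeros up to the maximum input length
    return idx + [0] * (max_len - len(idx))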

My model looks like this:

import torch
import torch.nn as nn

class LSTM_regr(torch.nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.linear = nn.Linear(hidden_dim, 1)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x, l):
        x = self.embeddings(x)
        x = self.dropout(x)
        lstm_out, (ht, ct) = self.lstm(x)
        # l (the true sequence lengths) is not used here
        return self.linear(ht[-1])

vocab_size = 76
model = LSTM_regr(vocab_size, 20, 256)
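For reference, a quick shape check with random indices and made-up lengths (purely illustrative) gives the expected (batch, 1) output:

x = torch.randint(1, vocab_size, (4, 100))   # hypothetical batch of 4 padded sequences
l = torch.tensor([44, 60, 32, 100])          # hypothetical true lengths
print(model(x, l).shape)                     # torch.Size([4, 1])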

My problem is that after training, the model gives the same output (e.g., 3.3318) for every input I test it on. Why is that?

My training loop:

import torch.nn.functional as F

def train_model_regr(model, epochs=10, lr=0.001):
    parameters = filter(lambda p: p.requires_grad, model.parameters())
    optimizer = torch.optim.Adam(parameters, lr=lr)
    for i in range(epochs):
        model.train()
        sum_loss = 0.0
        total = 0
        for x, y, l in train_dl:  # train_dl: DataLoader yielding (inputs, targets, lengths)
            x = x.long()
            y = y.float()
            y_pred = model(x, l)
            optimizer.zero_grad()
            loss = F.mse_loss(y_pred, y.unsqueeze(-1))
            loss.backward()
            optimizer.step()
            sum_loss += loss.item() * y.shape[0]
            total += y.shape[0]
        # average training loss for this epoch: sum_loss / total

EDIT:

I figured it out: I reduced the learning rate from 0.01 to 0.0005 and reduced the batch size from 100 to 10, and it worked fine.

I think this makes sense: with a large batch size, the model was learning to always output the mean of the targets, since a constant equal to the mean is what minimizes the MSE loss.
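To illustrate (with made-up targets, just for this check): fitting a single constant prediction with MSE converges to the mean of the targets.

import torch
import torch.nn.functional as F

y = torch.tensor([1.0, 2.0, 4.0, 7.0])    # made-up targets
c = torch.zeros(1, requires_grad=True)     # a single constant "prediction"
opt = torch.optim.SGD([c], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = F.mse_loss(c.expand_as(y), y)
    loss.backward()
    opt.step()
print(c.item(), y.mean().item())           # both come out ≈ 3.5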

2 Answers


Your LSTM_regr returns the last hidden state regardless of the true sequence length. That is, if your true sequence is of length 3 and x is padded to length 100, the output is the last hidden state after the LSTM has processed 97 padding elements.

You should compute the loss for the prediction that matches the true length of each sequence.
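One common way to do this in PyTorch is to pack the padded batch with pack_padded_sequence, so that ht[-1] corresponds to the last real token of each sequence rather than the last padding element. As a rough sketch (assuming the l in your forward signature is a tensor of the true lengths), the forward method could look like:

from torch.nn.utils.rnn import pack_padded_sequence

def forward(self, x, l):
    x = self.dropout(self.embeddings(x))
    # pack so the LSTM ignores the zero padding beyond each true length
    packed = pack_padded_sequence(x, l.cpu(), batch_first=True, enforce_sorted=False)
    _, (ht, ct) = self.lstm(packed)
    # ht[-1] is now the hidden state at the last real token of each sequence
    return self.linear(ht[-1])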


I figured it out: I reduced the learning rate from 0.01 to 0.0005 and reduced the batch size from 100 to 10, and it worked fine.

I think this makes sense: with a large batch size, the model was learning to always output the mean of the targets, since a constant equal to the mean is what minimizes the MSE loss.

