
I am building a class-incremental, multi-label classifier. The model first trains with 7 labels. After training, another dataset emerges that contains the same labels plus one more. I want to automatically add an extra output node to the trained network and continue training on this new dataset. How can I do this?

import torch
import torch.nn as nn
import torch.optim as optim

class FeedForewardNN(nn.Module):
    def __init__(self, input_size, h1_size=264, h2_size=128, num_services=8):
        super().__init__()
        self.input_size = input_size
        self.lin1 = nn.Linear(input_size, h1_size)
        self.lin2 = nn.Linear(h1_size, h2_size)
        self.lin3 = nn.Linear(h2_size, num_services)  # one output node per label
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.lin1(x)
        x = self.relu(x)
        x = self.lin2(x)
        x = self.relu(x)
        x = self.lin3(x)
        x = self.sigmoid(x)  # independent per-label probabilities (multi-label)
        return x

This is the architecture of the feedforward neural network. I first train on the dataset with only 7 classes.

# Create NN
input_size = len(x_columns)
net1 = FeedForewardNN(input_size, num_services=7)
alpha = 0.001

# Define optimizer and loss
optimizer = optim.Adam(net1.parameters(), lr=alpha)
criterion = nn.BCELoss()
running_loss = 0

# Training loop
loss_list = []
auc_list = []

for i in range(len(train_data_x)):
    optimizer.zero_grad()

    outputs = net1(train_data_x[i])
    loss = criterion(outputs, train_data_y[i])
    loss.backward()
    optimizer.step()

After this, however, I want to add one additional output node, initialize its new weights while keeping the old trained weights, and train on the new dataset.

1 Answer
I suggest replacing the layer with a new one of the desired shape, and then partially overwriting its parameters with the old trained values, as follows:

def increaseClassifier(m: torch.nn.Linear):
    old_out, old_in = m.weight.shape

    # New layer with one extra output node; the nn.Linear constructor
    # randomly initializes all of its parameters.
    m2 = nn.Linear(old_in, old_out + 1)

    # Keep the trained weights for the old outputs and append one
    # freshly initialized row (and bias entry) for the new node.
    m2.weight = nn.Parameter(torch.cat((m.weight, m2.weight[0:1])))
    m2.bias = nn.Parameter(torch.cat((m.bias, m2.bias[0:1])))
    return m2

class FeedForewardNN(nn.Module):
    ...
    def incrHere(self):
        # Swap the output layer for a widened copy.
        self.lin3 = increaseClassifier(self.lin3)
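
Putting it together, a minimal sketch of continuing training on the new dataset (here new_train_x and new_train_y are placeholder names for the 8-label data; net1, alpha, and criterion come from the question). Note that the optimizer must be re-created after the call, because the existing Adam instance still references the replaced lin3 parameters and holds stale state for them:

net1.incrHere()  # lin3 now has 8 output nodes; the first 7 keep their trained weights

# Fresh optimizer over the new parameter set
optimizer = optim.Adam(net1.parameters(), lr=alpha)

for i in range(len(new_train_x)):
    optimizer.zero_grad()
    outputs = net1(new_train_x[i])             # now 8 sigmoid outputs
    loss = criterion(outputs, new_train_y[i])  # targets must have 8 entries
    loss.backward()
    optimizer.step()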

UPD:

Can you explain how these additional weights that come with this new output node are initialized?

The initial weights for the new output node come from the creation of the new layer: the layer constructor makes new parameters with random initialization (Kaiming-uniform by default for nn.Linear), then we replace part of them with the trained weights, and the remaining part is ready for the new training.

m2.weight = nn.parameter.Parameter( torch.cat( (m.weight, m2.weight[0:1]) ) )
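
A quick self-contained check of this behavior (a sketch with made-up layer sizes): the rows copied from the old layer compare equal, while the extra row keeps the constructor's random initialization:

old = nn.Linear(128, 7)
new = increaseClassifier(old)

# The trained parameters are carried over unchanged...
assert torch.equal(new.weight[:7], old.weight)
assert torch.equal(new.bias[:7], old.bias)

# ...and the eighth row is whatever random init nn.Linear produced.
print(new.weight[7])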


2 Comments

Thanks, this works like a charm! Can you explain to me how the additional weights that come with this new output node are initialized? I don't see where this happens in the code.

ok, I'll edit the answer
