
I want to continue training a model using new data.

I understand that you can continue training a PyTorch Lightning model from a checkpoint, e.g.

pl.Trainer(max_epochs=10, resume_from_checkpoint='./checkpoints/blahblah.ckpt')

if your last checkpoint was saved at epoch 5. But is there a way to continue training by adding different data?

2 Answers


For new users of PyTorch Lightning, the new syntax looks something like this:

trainer = pl.Trainer()
trainer.fit(model, data, ckpt_path="./path/to/checkpoint")

Also, since I don't have enough reputation to comment: if you have already trained for 10 epochs and want to train for 5 more, pass the following parameter to the Trainer (max_epochs is the total number of epochs, so 10 + 5 = 15):

trainer = pl.Trainer(max_epochs=15)
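
For context, here is a minimal, self-contained sketch of resuming with the newer ckpt_path argument. MyModel and new_dataset are placeholders for your own LightningModule and Dataset, and the checkpoint path is illustrative:

import pytorch_lightning as pl
from torch.utils.data import DataLoader

model = MyModel()  # your LightningModule subclass (placeholder)
new_train_loader = DataLoader(new_dataset, batch_size=32, shuffle=True)

# max_epochs is the total epoch count, so 15 means 5 more epochs
# on top of the 10 already recorded in the checkpoint.
trainer = pl.Trainer(max_epochs=15)
trainer.fit(model, new_train_loader, ckpt_path="./path/to/checkpoint")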



Yes. When you resume from a checkpoint you can provide a new DataLoader or DataModule to trainer.fit(), and training will resume from the last epoch with the new data.

trainer = pl.Trainer(max_epochs=10, resume_from_checkpoint='./checkpoints/blahblah.ckpt')

trainer.fit(model, new_train_dataloader)
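
To show how a DataModule fits in when resuming, here is a hedged sketch along the lines of this answer. NewDataModule, MyModel and new_dataset are placeholders for your own classes and data, and it uses the same older resume_from_checkpoint argument shown above:

import pytorch_lightning as pl
from torch.utils.data import DataLoader, Dataset

class NewDataModule(pl.LightningDataModule):
    # Wraps the new training data so it can be handed to trainer.fit().
    def __init__(self, new_dataset: Dataset, batch_size: int = 32):
        super().__init__()
        self.new_dataset = new_dataset
        self.batch_size = batch_size

    def train_dataloader(self):
        return DataLoader(self.new_dataset, batch_size=self.batch_size, shuffle=True)

model = MyModel()  # same LightningModule class the checkpoint was saved from (placeholder)
datamodule = NewDataModule(new_dataset)

trainer = pl.Trainer(max_epochs=10, resume_from_checkpoint='./checkpoints/blahblah.ckpt')
# Weights, optimizer state and the epoch counter are restored from the
# checkpoint when fit() starts; only the data is new.
trainer.fit(model, datamodule=datamodule)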

2 Comments

If the model has previously been trained for 10 epochs and I want to train for 5 more epochs, should I keep max_epochs=5 or max_epochs=10? Reference: lightning.ai/forums/t/how-to-resume-training/432/8
How on earth can the DataModule be provided during training if we just load the model?
