I'm trying to understand the role of batch_size in PyTorch beyond model training. I already have a trained model, where batch_size was optimized as a hyperparameter. I want to use the model to make predictions on new data. I'm pre-processing my data with the same steps that were used for training. But instead of using the optimized batch_size and looping over the batches, I set batch_size equal to the size of the data set (with shuffle=False in this case) and pass the entire data set through the model once.
I was wondering about the correctness of this approach. I don't have a lot of experience with torch and couldn't find much information about the most efficient way to use a trained model for prediction.
Here is a simplified version of the predict method I implemented in the model class; it illustrates the use of DataLoader I'm referring to. I should mention that I noticed a significant speed-up with this approach over looping through the data in batches.
import torch
from torch.utils.data import DataLoader

def predict(self, X):
    # Wrap the whole data set in a single batch; shuffle=False keeps row order
    X_loader = DataLoader(
        X,
        batch_size=X.shape[0],
        shuffle=False,
    )
    batch = next(iter(X_loader))
    with torch.no_grad():  # no gradients needed at inference time
        predictions = self(batch)  # call self, not a global `model`, inside the method
    return predictions
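For reference, my understanding is that a DataLoader with batch_size equal to the data set size yields exactly one batch, so (assuming X is already a torch.Tensor of the expected shape) the loader could be skipped entirely:

with torch.no_grad():
    predictions = self(X)  # assumes X is already a torch.Tensor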
Thank you
EDIT
My question is whether it is correct to use the DataLoader in this way for making predictions, or whether I have to use the batch_size value that was optimized during training. In other words, can using a different batch_size for prediction than was used for training affect the results?
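For completeness, this is the batched fallback I would write if the single-batch approach turns out to be problematic, e.g. if the data no longer fits in memory (a sketch only; the batch_size=256 is an arbitrary placeholder, not the tuned value):

def predict_batched(self, X, batch_size=256):  # 256 is an arbitrary placeholder
    X_loader = DataLoader(X, batch_size=batch_size, shuffle=False)
    with torch.no_grad():
        # shuffle=False preserves row order, so outputs line up with X
        return torch.cat([self(batch) for batch in X_loader])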