
I'm trying to understand the role/utility of batch_size in torch beyond model training. I already have a trained model, where the batch_size was optimized as a hyperparameter. I want to use the model to make predictions on new data. I'm pre-processing my data in the same format as the implementation used for training. But instead of using the same batch_size that was optimized and looping through the different batches, I'm using a batch_size equal to the size of the data set (with shuffle=False in this case) and passing the entire data set for prediction at once.

I was wondering about the correctness of such an approach. I don't have a lot of experience with torch and couldn't find much information about the most efficient ways of using a trained model to make predictions.

Here is a simplified version of the predict method I implemented in the model class; it illustrates the use of DataLoader I'm referring to. I must say I noticed a significant speed-up with this approach over looping through the data.

import torch
from torch.utils.data import DataLoader

def predict(self, X):
    X_loader = DataLoader(
        X,
        batch_size=X.shape[0],  # one batch containing the whole data set
        shuffle=False,
    )

    self.eval()  # put dropout/batch-norm layers into inference mode
    batch = next(iter(X_loader))
    with torch.no_grad():
        predictions = self(batch)  # call the model itself, not an outer `model` variable

    return predictions

Thank you

EDIT

My question is whether it is correct to use the DataLoader in this way for making predictions, or whether I have to use the batch_size value optimized during the training process. In other words, can using a different batch_size for prediction than was used for training affect the result?

  • What is your actual question? Commented Jul 18 at 9:02
  • My question is whether it is correct to use the DataLoader in this way for making predictions, or whether I have to use the batch_size value optimized during the training process. In other words, can using a different batch_size for prediction than was used for training affect the result? My intuition says no, but I couldn't confirm that. Commented Jul 18 at 9:04
  • When you say the batch size was optimised, I assume this was just to help with the learning rate, etc., and is not actually an input value that is used by the model to infer any learnable parameters? If that is the case, then the batch size you pass into the model shouldn't matter (so long as you have enough memory). You could verify this by passing an input with a batch size of 1, and then an input with a batch size of 2 where each item is identical to the first input. If you get identical outputs for all items in the batch, then it shouldn't be an issue. Commented Jul 18 at 9:11
  • I ran a test here with two different batch sizes during inference, and in fact the results were identical. Thank you all for the help. Commented Jul 18 at 9:30
  • @ndrwnaguib I did so (totally forgot about that question in the meantime). Commented Aug 5 at 7:57

1 Answer


Short answer: The batch size during inference (i.e. for making predictions on new samples with a trained model) can, under all reasonable circumstances, be chosen independently of the batch size used during training.

Long answer

The role of the batches is different during model training and inference, so usually different batch sizes are applied in the two settings.

Batch size during training

During training, the main role of the batches is to provide an estimate of the complete data distribution, so that the gradient applied when updating the model parameters via backpropagation is a good approximation of the true gradient, and updating the parameters with it indeed leads towards an optimum. As a consequence, the batch size during training is often chosen as large as possible (i.e. limited by the hardware used for training). Other considerations and parameters, such as the learning rate and the optimization approach, also play a role, so several findings regarding the optimal choice of batch size have been published (see e.g. this and this paper; I am pretty sure there are also more recent ones).
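
For illustration, here is a minimal sketch of a typical training loop (dummy model and data, not taken from the question) showing where the training batch size enters: each parameter update uses the gradient estimated from one batch only.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy data and model, purely for illustration.
X, y = torch.randn(1000, 16), torch.randn(1000, 1)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

train_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model.train()
for xb, yb in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(xb), yb)  # gradient is estimated from this batch only
    loss.backward()
    optimizer.step()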

Batch size during inference

During inference, the only benefit of batching samples rather than processing them individually is increased parallelism (the same calculations are applied to all samples) and thus, potentially, increased efficiency. The batch size itself is usually determined by different factors than during training: on the one hand, since no gradients need to be calculated any more, larger batch sizes are usually possible if the same hardware that was used for training is still being used. On the other hand, inference often happens on less powerful hardware (think e.g. of mobile devices) or with different latency requirements (think e.g. of real-time applications), so a smaller batch size might be chosen instead. What is important in either scenario: once a trained model is applied, it should treat different samples independently, so the result for each individual sample should remain the same, no matter which and how many other samples are in its batch.
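
As an illustration, here is a minimal sketch of batched inference where the batch size is a value chosen purely for memory/throughput reasons rather than taken from training (the function name predict_in_batches and the default of 256 are illustrative, not from the question):

import torch
from torch.utils.data import DataLoader

def predict_in_batches(model, X, batch_size=256):
    """Run inference batch by batch; batch_size only affects speed/memory, not the results."""
    loader = DataLoader(X, batch_size=batch_size, shuffle=False)
    model.eval()
    outputs = []
    with torch.no_grad():
        for batch in loader:
            outputs.append(model(batch))
    return torch.cat(outputs)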

Of course, technically, everyone is free to design and implement a model that behaves differently, but I would not know what the motivation for that would be (which is what I am referring to with under all reasonable circumstances in the short answer above). Crucially, what I do not mean here are the inputs to something like video models or language models, where the sequence of input frames/tokens indeed plays a role: there, a "sample" is more than an individual video frame or language token, and the batch size should still not influence the result during inference.

In any case, a simple sanity check is to apply the same model to the same samples with different batch sizes and compare the outputs: if, for the same input, the outputs are identical (within the limits of floating-point math), the batch size is free to choose. A minimal version of this check is sketched below.
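
For instance, a minimal sketch of such a check; the model and the input tensor X here are dummies purely for illustration, so substitute your own trained model and preprocessed data:

import torch
from torch import nn

# Dummy model and data, for illustration only; replace with your trained model and input.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1))
X = torch.randn(100, 16)

model.eval()
with torch.no_grad():
    out_full = model(X)                                         # all samples in one batch
    out_single = torch.cat([model(x.unsqueeze(0)) for x in X])  # batch size of 1

# The outputs should match up to floating-point tolerance.
print(torch.allclose(out_full, out_single, atol=1e-6))  # expected: True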
