Why do we batch the dataset before training?
As a machine learning practitioner, you have probably wondered why it is standard practice to batch training data before feeding it to a neural network.
A straightforward answer is that training data are batched mainly for memory optimisation. Placing an entire dataset, for example all 60,000 images of the MNIST training set, in a GPU’s memory is very expensive; you would probably run into the infamous “RuntimeError: CUDA error: out of memory”.
To avoid these memory issues, large datasets are split into batches of, say, 16, 32, or 128 samples, with the batch size chosen according to the memory capacity of the available compute resource.
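As a quick illustration, here is a minimal sketch of batching the MNIST training set, assuming PyTorch and torchvision are available; the batch size of 32 and the "data" directory are arbitrary choices for this example, not fixed requirements.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download MNIST and convert each image to a tensor.
train_data = datasets.MNIST(
    root="data", train=True, download=True, transform=transforms.ToTensor()
)

# batch_size=32 means each iteration yields 32 images
# instead of all 60,000 at once.
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"

for images, labels in train_loader:
    # images has shape [32, 1, 28, 28]; only this batch is moved to the GPU,
    # so the full dataset never has to fit in GPU memory.
    images, labels = images.to(device), labels.to(device)
    # ... forward pass, loss computation, and backward pass would go here ...
    break  # one batch shown for illustration
```

Iterating over the DataLoader this way keeps GPU memory usage roughly constant per step, regardless of how large the dataset is on disk.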