Training and Evaluating the Network
Now comes the interesting part: training the network. So far we have defined the architecture of the neural network and compiled it with a loss function, an optimizer that updates the weights, and a metric to keep track of.
To train the network, we need to specify when training should end, in this case after a fixed number of epochs. We also set a batch_size, which tells the model how many samples to process before each weight update. Finally, we pass in the training samples and their labels.
>>> model.fit(train_images, train_labels, epochs=10, batch_size=256)
Epoch 1/10 235/235 [==============================] - 1s 3ms/step - loss: 0.5015 - accuracy: 0.8563
Epoch 2/10 235/235 [==============================] - 1s 3ms/step - loss: 0.1406 - accuracy: 0.9576
Epoch 3/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0895 - accuracy: 0.9737
Epoch 4/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0617 - accuracy: 0.9818
Epoch 5/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0467 - accuracy: 0.9863
Epoch 6/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0379 - accuracy: 0.9891
Epoch 7/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0293 - accuracy: 0.9919
Epoch 8/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0213 - accuracy: 0.9945
Epoch 9/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0160 - accuracy: 0.9958
Epoch 10/10 235/235 [==============================] - 1s 3ms/step - loss: 0.0132 - accuracy: 0.9965
Here the weights are updated after every 256 samples, and the full training set is scanned 10 times. With 60000 training samples, each epoch takes 235 batches (60000 / 256, rounded up), so the weights are updated 2350 times in total. By the final epoch, the model's accuracy on the training data is about 99.65%.
To evaluate performance on the test data, we pass the test images and test labels to the model.
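The update count can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, using the sample count, batch size, and epoch count quoted above; the variable names are made up for illustration.

```python
import math

num_samples = 60000   # training samples, as stated in the text
batch_size = 256
epochs = 10

# One weight update per batch; the last batch of an epoch may be smaller
# because 60000 is not a multiple of 256.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 235, matching the "235/235" in the training log
print(last_batch_size)   # 96
print(total_updates)     # 2350
```

The 235 steps per epoch match the progress counter in the log, and the final batch of each epoch holds only the 96 leftover samples.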
>>> test_loss, test_acc = model.evaluate(test_images, test_labels)
>>> test_acc
0.9804999828338623
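For intuition about what this number measures, accuracy is the fraction of samples whose highest-scoring class matches the true label. The sketch below computes it by hand on a tiny made-up example. It assumes integer class labels and one list of class probabilities per sample; the `accuracy` helper and the data are hypothetical and are not how Keras computes the metric internally.

```python
def accuracy(predicted_probs, true_labels):
    """Fraction of samples whose most probable class equals the label."""
    correct = 0
    for probs, label in zip(predicted_probs, true_labels):
        # Index of the largest probability is the predicted class.
        predicted_class = max(range(len(probs)), key=probs.__getitem__)
        if predicted_class == label:
            correct += 1
    return correct / len(true_labels)

# Toy data (made up): 4 samples, 3 classes.
toy_probs = [
    [0.1, 0.7, 0.2],   # predicts class 1
    [0.8, 0.1, 0.1],   # predicts class 0
    [0.2, 0.3, 0.5],   # predicts class 2
    [0.6, 0.3, 0.1],   # predicts class 0 (true label is 1, so this is wrong)
]
toy_labels = [1, 0, 2, 1]

toy_acc = accuracy(toy_probs, toy_labels)
print(toy_acc)  # 0.75
```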
The test accuracy (about 98.05%) is lower than the training accuracy (about 99.65%). The gap is small here; a large gap would indicate that the model is overfitting the training data.
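To put a number on that difference, the two accuracies reported above can simply be subtracted. This is a minimal sketch using the values from the training log and the evaluation output; what counts as a "large" gap depends on the problem, so no threshold is suggested here.

```python
train_acc = 0.9965               # final-epoch training accuracy from the log
test_acc = 0.9804999828338623    # value returned by model.evaluate

# Difference between training and test accuracy.
gap = train_acc - test_acc
print(f"{gap * 100:.2f} percentage points")  # 1.60 percentage points
```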
We have learned how to classify handwritten digits in fewer than 20 lines of code.
Stay tuned for the next article in the series, where we will discuss data representation for neural networks and tensor operations, with some code examples.
You can find the code for this article here: