“Learning to write programs stretches your mind, and helps you think better, creates a way of thinking about things that I think is helpful in all domains.”
— Bill Gates, Co-Chairman, Bill & Melinda Gates Foundation, Co-Founder, Microsoft
Let’s start with coding!
Matplotlib is used to plot graphs and display images in Python. NumPy is a library for numerical mathematics that works on multi-dimensional arrays. OpenCV (imported as cv2) is an extensive open-source library used for detecting real-world objects, identifying features in images, and many other image-processing projects. Scikit-learn (sklearn) is useful in machine-learning-oriented projects, as it provides a wide variety of approaches for building models, such as linear regression, classification, and dimensionality reduction. Pandas works mainly on dataframes, whereas NumPy works on multi-dimensional arrays. TensorFlow is used to train neural networks and machine-learning models, and provides a fast way to train deep learning models.
This is the dataset file used in our code. The ‘0’ column denotes the labels, which have been converted to integer values.
Loading the dataset using the pandas library.
OUTPUT: (Get a data insight !)
The ‘0’ column in the dataset contains the labels; here we drop ‘0’ from the dataframe and store the labels in y.
The data is split here into training and test sets using the train_test_split function imported from sklearn.model_selection, and each 784-pixel row is then reshaped into a 28 × 28 image.
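The loading, label-dropping, splitting, and reshaping steps above can be sketched as follows. The Kaggle CSV filename is an assumption, so this snippet builds a tiny synthetic dataframe with the same layout (a ‘0’ label column followed by 784 pixel columns) to stay self-contained:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Real code would read the A-Z handwritten-letters CSV (filename assumed):
# data = pd.read_csv("A_Z Handwritten Data.csv").astype("float32")

# Self-contained stand-in with the same layout: label column '0' + 784 pixels.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(40, 784)).astype("float32")
labels = rng.integers(0, 26, size=(40, 1)).astype("float32")
data = pd.DataFrame(np.hstack([labels, pixels]))
data.columns = ["0"] + [str(i) for i in range(1, 785)]

# '0' holds the labels: drop it from the features, keep it as y.
X = data.drop("0", axis=1)
y = data["0"]

# Split, then reshape each 784-pixel row into a 28 x 28 image.
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2)
train_x = np.reshape(train_x.values, (train_x.shape[0], 28, 28))
test_x = np.reshape(test_x.values, (test_x.shape[0], 28, 28))
print(train_x.shape, test_x.shape)
```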
word_dict is defined to map each integer label in the dataframe to its alphabet character.
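A mapping along these lines is what the step describes, pairing each of the 26 integer labels with its letter:

```python
# Map each integer label (0-25) to its alphabet character.
word_dict = {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E', 5: 'F', 6: 'G',
             7: 'H', 8: 'I', 9: 'J', 10: 'K', 11: 'L', 12: 'M', 13: 'N',
             14: 'O', 15: 'P', 16: 'Q', 17: 'R', 18: 'S', 19: 'T',
             20: 'U', 21: 'V', 22: 'W', 23: 'X', 24: 'Y', 25: 'Z'}
print(len(word_dict))
```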
Here a for loop iterates over the labels in word_dict. For each label, the number of images present in the dataset is counted: the label values are converted to integers and the counts appended to the list count. The list alphabets holds the characters from the dictionary values. Both the count and alphabet lists are then used to create a plot of class frequencies.
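A runnable sketch of that counting loop and plot is below. The labels y are synthetic stand-ins for the dataframe's ‘0’ column, and a horizontal bar chart of class frequencies is one plausible reading of the figure:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt
import numpy as np

word_dict = {i: chr(ord('A') + i) for i in range(26)}

# Stand-in labels; in the tutorial y comes from the dataframe's '0' column.
rng = np.random.default_rng(0)
y = rng.integers(0, 26, size=(500,)).astype("float32")

count = []
for label in word_dict:
    # Count how many images carry this integer label.
    count.append(int(np.sum(y.astype(int) == label)))
alphabets = [value for value in word_dict.values()]

plt.figure(figsize=(10, 4))
plt.barh(alphabets, count)          # class-frequency bar plot
plt.xlabel("Number of images")
plt.ylabel("Alphabet")
plt.savefig("class_counts.png")
print(sum(count))
```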
The images are being reshaped so as to fit the CNN model
Here we convert the single float label values into categorical (one-hot) vectors. This is done because the CNN model takes these as its target labels and generates its output as a vector of probabilities.
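The reshape-for-CNN and one-hot steps can be sketched like this, with small synthetic arrays standing in for the split produced earlier:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Stand-ins for the arrays produced by the earlier split.
rng = np.random.default_rng(0)
train_x = rng.random((32, 28, 28)).astype("float32")
train_y = rng.integers(0, 26, size=(32,)).astype("float32")

# Add the trailing channel axis the Conv2D layer expects: (n, 28, 28, 1).
train_X = train_x.reshape(train_x.shape[0], 28, 28, 1)

# One-hot encode the integer labels into 26-way categorical vectors.
train_yOHE = to_categorical(train_y, num_classes=26, dtype='int')
print(train_X.shape, train_yOHE.shape)
```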
A lot is going on here. The main CNN layer is the Conv2D() layer. An illustration, such as the one given below, best conveys the concept of convolution. The diagram shows a condensed 6×6 picture in green. On the top and bottom, and on the left and right, the picture is padded by a single row/column of 0 values, shown in red.
Convolution uses a filter (sometimes called a kernel). The figure shows a 3×3 filter in orange. The convolution filter values play exactly the same role as the weight values in a normal neural network. You can see that, beginning from the top left, the filter overlays the padded image. The output, shown in purple, is a 5×5 matrix whose values are computed as shown. After each application, the filter is moved to the right by one pixel; the shift distance is called the stride.
The ideas that underlie convolution are very deep. In short, convolution significantly decreases the number of weights in a CNN, making training feasible for large pictures. In addition, convolution helps the model accommodate images that are shifted up or down by a few pixels. At the cost of raising the number of weights (filter values) and thus increasing the training time, more layers and a greater number of filters improve the predictive capacity of a CNN.
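The sliding-filter mechanics described above can be made concrete with a plain NumPy sketch (a loop-based illustration, not how Keras implements it; the output size follows (H + 2·pad − k) // stride + 1):

```python
import numpy as np

def conv2d(image, kernel, stride=1, pad=0):
    """Plain 2-D convolution (cross-correlation) with zero padding."""
    img = np.pad(image, pad)                     # pad rows/cols with zeros
    k = kernel.shape[0]
    out_h = (img.shape[0] - k) // stride + 1
    out_w = (img.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(patch * kernel)   # weighted sum = one output pixel
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # a 6 x 6 "picture"
kernel = np.ones((3, 3)) / 9.0                     # 3 x 3 averaging filter
print(conv2d(image, kernel, stride=1, pad=1).shape)  # (6, 6)
```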
CNN MODEL:
The code below outputs a model summary, which shows the different layers of the CNN network. In a CNN each layer has two kinds of parameters: weights and biases. The total parameter count denotes the sum of the weights and biases.
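A minimal sketch of the kind of model the summary describes is below; the exact filter counts and layer sizes of the tutorial's model are assumptions here:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Illustrative CNN: conv/pool blocks, flatten, dense head (sizes assumed).
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(26, activation='softmax'),   # one output unit per letter A-Z
])
model.summary()   # per-layer weights + biases; total = their sum
```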
Deep neural networks are also very sensitive to the initial values of the weights and biases, so Keras has many different initialization functions that you can use.
Adadelta() (“adaptive delta”), one of the specialized variants of simple stochastic gradient descent, is one possible training optimizer object. RMSprop(), Adagrad(), and Adam() are acceptable alternatives, but SGD() usually does not work well for CNN image classification. In our model we have used Adam().
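Compiling with Adam and categorical cross entropy looks like this (the model is abbreviated to a minimal stand-in so the snippet is self-contained):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Abbreviated model; the real CNN's conv/pool layers are omitted here.
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(26, activation='softmax')])

# Adam optimizer + categorical cross entropy, as used in the tutorial.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
print(model.optimizer.__class__.__name__)
```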
15 parameters are recognised by the Keras Conv2D() function, but only two are required: filters (the number of filters) and kernel size. The strides default is (1,1), so it may have been skipped in the demo code. The parameter padding may be ‘valid’ or ‘same’, where the default value is ‘valid’. Using ‘same’ implies that Conv2D() will attempt to pad as closely as possible all around the picture.
CNN pooling is optional, although it is normally used. A 2×2 pooling layer scans its input matrix, looking at each non-overlapping 2×2 cell. In each 2×2 grid, the four values are replaced by one single value, the largest of the four. Pooling decreases the number of parameters, thereby speeding up training. Pooling also smooths images, which often contributes to a better model in terms of accuracy. Therefore, the program has a MaxPooling2D() layer.
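The 2×2 max-pooling rule can be demonstrated directly in NumPy (an illustration of the operation, not Keras's implementation):

```python
import numpy as np

def max_pool_2x2(x):
    """Replace each non-overlapping 2x2 cell with its largest value."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [1., 1., 4., 0.]])
print(max_pool_2x2(x))   # [[4. 8.]
                         #  [9. 4.]]
```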
Ultimately, a CNN is a classifier, so it requires a one-dimensional final layer. The Flatten() layer reshapes the existing matrix into a single dimension so that one or more Dense() layers can be added and categorical cross entropy used.
The following diagram shows a visualization of the architecture, with each layer fully connected to the surrounding layers:
The function fit() returns an object containing full logging information. This is often helpful for studying a model that refuses to train. Training consists of feeding the training dataset through the graph and minimizing the loss function. Each time the network iterates over a batch of training images, it updates its parameters to reduce the error, so as to predict the displayed characters more accurately. Testing consists of running our test dataset through the trained graph and keeping track of the number of correctly predicted images so that the accuracy can be measured.
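A sketch of that training call on tiny synthetic data (the real code feeds the reshaped A-Z images); fit() returns a History object whose .history dict holds the per-epoch logs:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Tiny synthetic stand-ins for the reshaped images and one-hot labels.
rng = np.random.default_rng(0)
train_X = rng.random((64, 28, 28, 1)).astype("float32")
train_yOHE = np.eye(26)[rng.integers(0, 26, size=64)]

# Abbreviated model so the snippet trains in a moment.
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(26, activation='softmax')])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_X, train_yOHE, epochs=2, batch_size=32, verbose=0)
print(sorted(history.history))   # per-epoch logging keys, e.g. loss/accuracy
```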
The term “deep neural network” refers to the number of hidden layers: “shallow” usually means only one hidden layer, and “deep” means several hidden layers. In principle, a shallow neural network with a sufficient number of units, given enough training data, should be able to represent any function that a deep neural network can.
OUTPUT:
Accuracy scores for the CNN model including the train and test datasets.
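Scoring on the train and test sets is done with evaluate(); the data below is synthetic, so the printed numbers are not the tutorial's real scores:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Synthetic stand-ins for the real train/test splits.
rng = np.random.default_rng(0)
train_X = rng.random((64, 28, 28, 1)).astype("float32")
train_y = np.eye(26)[rng.integers(0, 26, size=64)]
test_X = rng.random((16, 28, 28, 1)).astype("float32")
test_y = np.eye(26)[rng.integers(0, 26, size=16)]

model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(26, activation='softmax')])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_X, train_y, epochs=1, verbose=0)

# evaluate() returns [loss, accuracy] for each dataset.
train_loss, train_acc = model.evaluate(train_X, train_y, verbose=0)
test_loss, test_acc = model.evaluate(test_X, test_y, verbose=0)
print(f"train acc: {train_acc:.3f}, test acc: {test_acc:.3f}")
```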
Displaying some of the test images & their predicted labels.
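The display step can be sketched as below: each prediction is a 26-way probability vector, and argmax picks the letter via word_dict. The model here is an untrained stand-in and the images are synthetic, so the titles are illustrative only:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

word_dict = {i: chr(ord('A') + i) for i in range(26)}

# Synthetic test images and an (untrained) stand-in model.
rng = np.random.default_rng(0)
test_X = rng.random((9, 28, 28, 1)).astype("float32")
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(26, activation='softmax')])

# Each row of preds is a probability vector over the 26 letters.
preds = model.predict(test_X, verbose=0)
fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for img, probs, ax in zip(test_X, preds, axes.ravel()):
    ax.imshow(img.squeeze(), cmap='gray')
    ax.set_title(f"Prediction: {word_dict[int(np.argmax(probs))]}")
    ax.axis('off')
fig.savefig("predictions.png")
print(preds.shape)
```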
OUTPUT:
You can encode and normalize the data in a preprocessing step before performing CNN image classification, or you can do so programmatically. The Conv2D() layer expects an input of shape (width, height, channels), where channels = 1 for a grayscale image. The number of filters, the kernel size, and the strides are hyperparameters whose values must be determined by experimentation.
Optional, but common, is the use of one or more MaxPooling2D() and Dropout() layers. To use a cross-entropy loss function, you must place a Flatten() layer before the final Dense() output layer. Adagrad, Adadelta, RMSprop, and Adam are all reasonable choices of training optimizer. The batch size and the maximum number of training epochs are further hyperparameters.
The training and test data used in this tutorial can be found here:
For more in-depth knowledge about CNNs, you can refer to this amazing material. Good luck!
We hope you enjoyed this article!
Authors: Amisha Singh and Aditi Singh.
Code Link: The whole code for this article can be found in the below GitHub link.