
Artificial neural networks play a central role in machine learning. In particular, using an artificial neural network for classification tasks can help people make better decisions. However, before implementing an artificial neural network, it is necessary to understand its architecture.
The architecture of an artificial neural network varies from goal to goal because it has many parameters, and the parameters chosen directly affect the result obtained. This post uses an artificial neural network to classify the MNIST dataset and uses a genetic algorithm to optimize the network architecture parameters. In this scenario, the following parameters are considered in order to reach a significant result:
Activation functions ["tanh", "softmax", "relu" or "sigmoid"];
Depth [1, 2, 3, 4, 5, 6, 7, 8, 9 or 10];
Losses ["mean_squared_error", "mean_absolute_error", "mean_absolute_percentage_error", "mean_squared_logarithmic_error", "squared_hinge", "hinge", "categorical_hinge", "logcosh", "categorical_crossentropy", "sparse_categorical_crossentropy", "binary_crossentropy", "kullback_leibler_divergence", "poisson" or "cosine_proximity"];
Neurons per layer [16, 32, 64, 128, 512 or 1024];
Optimizer ["sgd", "rmsprop", "adagrad", "adadelta", "adam", "adamax" or "nadam"].
From these options we build a new array variable, named $DNA_Parameter, which has 5 positions and is initialized randomly. Figure 1 presents this array and its values.
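As a rough illustration of this random initialization, here is a minimal Python sketch. The names SEARCH_SPACE and random_dna are hypothetical and are not taken from the repository; the sketch only assumes that each of the 5 positions holds one value drawn at random from the corresponding list above.

```python
import random

# Hypothetical search space mirroring the parameter lists above.
# The key order defines the 5 positions of the DNA array.
SEARCH_SPACE = {
    "activation": ["tanh", "softmax", "relu", "sigmoid"],
    "depth": list(range(1, 11)),
    "loss": [
        "mean_squared_error", "mean_absolute_error",
        "mean_absolute_percentage_error", "mean_squared_logarithmic_error",
        "squared_hinge", "hinge", "categorical_hinge", "logcosh",
        "categorical_crossentropy", "sparse_categorical_crossentropy",
        "binary_crossentropy", "kullback_leibler_divergence",
        "poisson", "cosine_proximity",
    ],
    "neurons": [16, 32, 64, 128, 512, 1024],
    "optimizer": ["sgd", "rmsprop", "adagrad", "adadelta",
                  "adam", "adamax", "nadam"],
}


def random_dna(rng=random):
    """Return a 5-position list with one random value per parameter."""
    return [rng.choice(options) for options in SEARCH_SPACE.values()]


dna_parameter = random_dna()
print(dna_parameter)
```

Each call produces a different candidate architecture, so calling it once per individual would give the initial population.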
At this point we can take the next step and use the genetic algorithm to find better parameters for the network architecture. To do so, it is necessary to be familiar with some genetic algorithm concepts. Figure 2 shows the approaches used to achieve our goal (finding better parameters for the artificial neural network architecture).
It is important to mention that the parameters used in this implementation were the following:
- Population Size = 4
- Mutation Rate = 2%
- Generations = 6
- Epochs = 1
- Batch Size = 1
- Verbose = 1
- Crossover Method = Midpoint
The parameters Epochs, Batch Size, and Verbose apply to training the artificial neural network, and the remaining ones apply to the genetic algorithm. With a good understanding of the genetic algorithm approach, you will see that many mutations are performed while the implementation runs in order to find a better choice. Figure 3 illustrates such a mutation.
After the mutation, new values are assigned to the parameters, as Figure 4 shows.
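The crossover and mutation steps can be sketched as follows. This is an assumption about how they might work, not the repository's code: midpoint crossover is taken here to mean that the child copies the genes before the middle index from one parent and the rest from the other, and mutation replaces each gene with a random valid value with probability equal to the mutation rate (2%). GENE_OPTIONS, midpoint_crossover, and mutate are hypothetical names, and the loss list is shortened for brevity.

```python
import random

# Hypothetical per-position options (loss list shortened for brevity).
GENE_OPTIONS = [
    ["tanh", "softmax", "relu", "sigmoid"],               # activation
    list(range(1, 11)),                                    # depth
    ["categorical_crossentropy", "mean_squared_error"],    # loss
    [16, 32, 64, 128, 512, 1024],                          # neurons per layer
    ["sgd", "rmsprop", "adagrad", "adadelta",
     "adam", "adamax", "nadam"],                           # optimizer
]


def midpoint_crossover(parent_a, parent_b):
    """Child takes genes before the midpoint from A, the rest from B."""
    midpoint = len(parent_a) // 2
    return parent_a[:midpoint] + parent_b[midpoint:]


def mutate(dna, mutation_rate=0.02, rng=random):
    """Replace each gene by a random valid option with the given probability."""
    return [
        rng.choice(GENE_OPTIONS[i]) if rng.random() < mutation_rate else gene
        for i, gene in enumerate(dna)
    ]


parent_a = ["relu", 3, "categorical_crossentropy", 64, "adam"]
parent_b = ["tanh", 7, "mean_squared_error", 512, "sgd"]
child = mutate(midpoint_crossover(parent_a, parent_b))
print(child)
```

With a 2% rate and only 5 genes, most children are unchanged by mutation, so nearly all of the variation comes from crossover.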
This loop repeats until the algorithm reaches the configured number of generations, and then it plots the accuracy rate against the generation number, as shown in Figure 5.
Each line drawn in the plot represents one artificial neural network architecture, as shown in Figure 5; that is, the number of lines corresponds to the population_size variable.
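The overall loop can be sketched in Python as below. This is only an illustration under stated assumptions: the selection rule (the two fittest individuals become the parents of the whole next generation) is assumed, since the selection method is not described in the text, and toy_fitness is a stand-in for training a network and returning its accuracy. The names evolve, toy_fitness, and toy_breed are hypothetical.

```python
import random

POPULATION_SIZE = 4
GENERATIONS = 6


def evolve(population, fitness_fn, breed_fn, generations=GENERATIONS):
    """Run the generational loop; return (final population, score history)."""
    history = []
    for _ in range(generations):
        scores = [fitness_fn(dna) for dna in population]
        history.append(scores)
        # Assumed selection rule: the two fittest individuals become parents.
        ranked = sorted(zip(scores, population),
                        key=lambda pair: pair[0], reverse=True)
        parent_a, parent_b = ranked[0][1], ranked[1][1]
        population = [breed_fn(parent_a, parent_b) for _ in population]
    return population, history


def toy_fitness(dna):
    # Stand-in for "train the network and return its accuracy".
    return sum(dna) / (10 * len(dna))


def toy_breed(parent_a, parent_b):
    # Midpoint crossover followed by a 2% per-gene mutation.
    midpoint = len(parent_a) // 2
    child = parent_a[:midpoint] + parent_b[midpoint:]
    return [random.randint(0, 10) if random.random() < 0.02 else gene
            for gene in child]


population = [[random.randint(0, 10) for _ in range(5)]
              for _ in range(POPULATION_SIZE)]
final_population, history = evolve(population, toy_fitness, toy_breed)
for generation, scores in enumerate(history, start=1):
    print(generation, [round(score, 2) for score in scores])
```

The history list holds one score per individual per generation, which is the data one would plot as accuracy against generation number.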
Therefore, it is possible to conclude that the parameters mentioned above, mainly the batch size and the number of generations, strongly influence whether the search converges to a significant result for the network architecture parameters. In Figure 5, for example, all four networks reached an accuracy above 80%, and the algorithm had already converged after the third generation.
In the end, we conclude that using an artificial neural network for classification and a genetic algorithm to optimize the network architecture is a good idea.
The repository is available here: https://github.com/douglasamante/ga_to_optimize_ann