HYPERPARAMETER TUNING IN MACHINE LEARNING

Every machine learning model can be described as a mathematical model with a number of parameters. The values of these parameters affect training and, in turn, the accuracy of the model.

The term “tuning” refers to adjusting these parameter values. The goal is to find the best value for each of the model’s parameters.

In the early days of ML models, tuning was done manually by a team of experts. This required a lot of effort and expertise, and only someone well-versed in the domain could do it well.

Automated hyperparameter tuning was introduced to address this problem.

Hyperparameter tuning can be defined as the process of choosing a set of optimal hyperparameters for a learning algorithm.

A hyperparameter is a parameter whose value controls the learning process. It is set before training rather than learned from the data.
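To make the distinction concrete, here is a minimal sketch, assuming a toy one-weight linear model fitted by gradient descent. The function name `train`, the data and the chosen values are illustrative assumptions, not part of any library. The learning rate and the number of epochs are hyperparameters, while the weight `w` is an ordinary parameter learned from the data.

```python
def train(xs, ys, learning_rate, n_epochs):
    """Fit y = w * x by gradient descent (illustrative toy model).

    learning_rate and n_epochs are hyperparameters: chosen before training.
    w is a model parameter: learned from the data during training.
    """
    w = 0.0
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated with a true weight of 2

# Same data, same model, different hyperparameter values.
w_good = train(xs, ys, learning_rate=0.05, n_epochs=200)
w_slow = train(xs, ys, learning_rate=0.0001, n_epochs=200)
print(w_good, w_slow)
```

With the larger learning rate the weight ends up very close to 2, while the very small learning rate leaves it far from 2 after the same number of epochs. Only the hyperparameter changed between the two runs.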

Hyperparameter tuning is an automated process for getting the best accuracy we can from the classifier’s predictions. In other words, it is used to improve the classifier’s output.

As stated above, every model is defined by a set of parameters. With hyperparameter tuning, we pick the best hyperparameter values for the classifier before training begins. It is a fundamental step that can play a big role in improving the overall performance of the model.

There are multiple ways to perform hyperparameter tuning in machine learning. These include:

Grid Search
Random Search
Bayesian Optimization

GRID SEARCH

Credit : https://gifer.com/en/9w8l

Grid search is a conventional hyperparameter tuning method, and quite basic compared to the other methods. It is a methodical approach in which every combination of candidate values for the hyperparameters is tried. The accuracy is measured for each combination, and the combination that gives the best result is taken as the final set of hyperparameter values.

For example, consider a model with 3 hyperparameters x, y and z.

Let the values that each hyperparameter can take be given by the 3 arrays:

x = [0, 0.25, 0.5, 0.75]

y = [0.3, 0.6, 0.9]

z = [0.2, 0.4, 0.6, 0.8]

Then the first combination will be (0, 0.3, 0.2), the next will be (0, 0.3, 0.4), and so on, for a total of 4 × 3 × 4 = 48 combinations. Finally, the combination with the best result is taken.
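The enumeration in this example can be sketched in a few lines of Python. The `score` function here is a hypothetical stand-in for "train the model with these values and measure its accuracy"; in practice that step is the expensive part.

```python
from itertools import product

# Candidate values for each hyperparameter, as in the example above.
x = [0, 0.25, 0.5, 0.75]
y = [0.3, 0.6, 0.9]
z = [0.2, 0.4, 0.6, 0.8]


def score(x_val, y_val, z_val):
    # Hypothetical scoring function (an assumption for illustration only).
    # A real one would train the model and return a validation metric.
    return -((x_val - 0.5) ** 2 + (y_val - 0.6) ** 2 + (z_val - 0.4) ** 2)


# Every combination of one value from each array.
grid = list(product(x, y, z))

# Evaluate all combinations and keep the one with the highest score.
best = max(grid, key=lambda combo: score(*combo))

print(len(grid))  # number of combinations tried
print(grid[:2])   # the first two combinations
print(best)       # best combination under the toy score
```

Because the number of combinations is the product of the array lengths, adding one more hyperparameter or a few more candidate values makes the grid grow quickly.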

RANDOM SEARCH

Credit : https://davidbaptistechirot.blogspot.com/2018/06/random-number-generator-gif.html

The concept of Random search differs from that of Grid search. As stated above, Grid search tests every combination of values from a grid and picks the best one. In Random search, by contrast, a random combination of values is sampled and evaluated in each iteration.

We can specify the number of iterations for which Random search should run. In scikit-learn’s RandomizedSearchCV, for example, this is set with the parameter “n_iter”.

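This method usually requires less time and computational power than Grid search, because only a fixed number of combinations is evaluated rather than all of them. Hence, it is a good way to perform hyperparameter tuning, although it is not guaranteed to find the best combination.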
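A minimal sketch of the idea, written in plain Python rather than with any particular library: the function `random_search`, the sampling ranges and the toy `score` function are assumptions made for illustration, and `n_iter` simply mirrors the parameter name mentioned above.

```python
import random


def score(x_val, y_val, z_val):
    # Hypothetical scoring function (an assumption for illustration only).
    # A real one would train the model and return a validation metric.
    return -((x_val - 0.5) ** 2 + (y_val - 0.6) ** 2 + (z_val - 0.4) ** 2)


def random_search(n_iter, seed=0):
    """Sample n_iter random combinations and return the best one found."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best_combo, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw each hyperparameter uniformly from an assumed range.
        combo = (
            rng.uniform(0.0, 0.75),
            rng.uniform(0.3, 0.9),
            rng.uniform(0.2, 0.8),
        )
        current = score(*combo)
        if current > best_score:
            best_combo, best_score = combo, current
    return best_combo, best_score


best_combo, best_score = random_search(n_iter=20)
print(best_combo, best_score)
```

The cost is controlled directly by `n_iter`: 20 evaluations here, regardless of how many hyperparameters there are or how wide their ranges are.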

BAYESIAN OPTIMIZATION

Credit : https://gfycat.com/blondillustrioushorsechestnutleafminer

Bayesian Optimization is a probabilistic approach to hyperparameter tuning. The technique is based on Bayes’ theorem.

In this method, the results of past evaluations are taken into account. Based on these past results, a probabilistic model of the score as a function of the hyperparameters is built. This model is called a surrogate model.

In each iteration, the surrogate model is used to choose the next hyperparameter values to try, and it is then updated with the new result. This is repeated for the specified number of iterations. The best hyperparameter values found are available at the end of the final iteration.
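The loop described above can be sketched for a single hyperparameter as follows. This is a rough illustration under stated assumptions, not a reference implementation: the surrogate is assumed to be a Gaussian process with an RBF kernel, the next point is chosen with an upper-confidence-bound rule, and `objective`, `bayes_opt`, `kappa` and the candidate grid are all hypothetical names and choices that the text above does not specify.

```python
import math


def objective(x):
    # Hypothetical stand-in for "train with hyperparameter x and return a score".
    return -((x - 0.6) ** 2)


def rbf(a, b, length=0.2):
    # RBF kernel: similarity between two hyperparameter values.
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, b):
    # Solve A @ x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    # Surrogate model: posterior mean and standard deviation at x_new,
    # given all past evaluations (xs, ys).
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * ai for ki, ai in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))


def bayes_opt(n_iter, kappa=2.0):
    candidates = [i / 100 for i in range(101)]  # assumed search range [0, 1]
    xs = [0.0, 1.0]                             # two initial evaluations
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        def ucb(c):
            # Upper confidence bound: favor high predicted score or high uncertainty.
            mean, std = gp_posterior(xs, ys, c)
            return mean + kappa * std
        x_next = max((c for c in candidates if c not in xs), key=ucb)
        xs.append(x_next)                # evaluate the chosen point ...
        ys.append(objective(x_next))     # ... and add it to the past results
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]


best_x, best_y = bayes_opt(n_iter=10)
print(best_x, best_y)
```

The surrogate is cheap to evaluate, so it can be queried at every candidate, while the expensive objective is only called once per iteration. In practice a dedicated library would normally be used instead of a hand-written surrogate like this one.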

CONCLUSION

Hyperparameter tuning is indeed a fundamental way to improve a model. Choosing a good value for each hyperparameter improves the quality of the training. This can play a huge role in improving the model’s overall performance, or in other words, in delivering better output predictions.
