Tuning Model using HPSklearn
Building a machine learning model does not by itself solve a problem; the model still has to be tuned for better accuracy and performance. Searching for the best hyperparameters with GridSearchCV or RandomizedSearchCV takes a lot of time, and repeating that search for every candidate model takes even longer.
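For comparison, here is a minimal sketch of the manual approach using scikit-learn's GridSearchCV on a single model. The dataset, the choice of SVC, and the parameter grid are assumptions made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Small synthetic dataset, used only for illustration.
X, y = make_classification(n_samples=100, n_features=10, n_informative=5,
                           n_redundant=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

# The grid is hand-written and specific to SVC; every other model
# would need its own grid and its own search.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X_train, y_train)

# Every combination in the grid is fitted once per cross-validation fold.
n_candidates = len(search.cv_results_["params"])
test_accuracy = search.score(X_test, y_test)
print("Candidates tried:", n_candidates)
print("Best parameters:", search.best_params_)
print("Test accuracy: %.3f" % test_accuracy)
```

Even this small grid fits every combination once per fold, and the whole exercise has to be repeated with a new grid for each additional model you want to compare.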
HPSklearn is an open-source Python library, built on Hyperopt, that not only selects the best model for your data but also finds the best hyperparameters for that model. It is easy to use and covers a wide range of scikit-learn classifiers and preprocessing steps.
In this article, we will discuss how to use HPSklearn to save time and effort in machine learning modeling.
Let’s get started by installing the library.
!pip install hpsklearn
Next, we import the libraries used in this article.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from hpsklearn import HyperoptEstimator, svc
from hpsklearn import any_classifier
from hpsklearn import any_preprocessing
from hyperopt import tpe
We will use a synthetic binary classification dataset generated with scikit-learn's make_classification.
# define dataset
X, y = make_classification(n_samples=100, n_features=10, n_informative=5, n_redundant=5, random_state=1)
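A quick sanity check of what make_classification returns can help before starting the search. The sketch below reuses the same arguments as above; the printed summaries are just one way to inspect the data.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same arguments as in the article: 100 samples, 10 features,
# 5 informative and 5 redundant, with the default of two classes.
X, y = make_classification(n_samples=100, n_features=10, n_informative=5,
                           n_redundant=5, random_state=1)

print("Feature matrix shape:", X.shape)
print("Label vector shape:", y.shape)

# Count how many samples fall into each class.
class_counts = np.bincount(y)
print("Samples per class:", class_counts)
```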
Now we split the data and let HPSklearn search over classifiers and preprocessing steps.
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define search: try up to 10 configurations (max_evals), allowing at most 30 seconds per trial (trial_timeout)
model = HyperoptEstimator(classifier=any_classifier('cla'), preprocessing=any_preprocessing('pre'), algo=tpe.suggest, max_evals=10, trial_timeout=30)
# perform the search
model.fit(X_train, y_train)
# summarize performance
acc = model.score(X_test, y_test)
print("Accuracy: %.3f" % acc)
# summarize the best model
print(model.best_model())
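Accuracy alone can hide how a model behaves on each class. The sketch below shows one way to get a per-class summary from any fitted estimator that exposes predict, which the fitted HyperoptEstimator does. The helper name evaluate is hypothetical, and a plain scikit-learn decision tree is used as a stand-in so the example runs without a search.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate(estimator, X_test, y_test):
    """Return the confusion matrix and per-class report for a fitted estimator.

    `evaluate` is a hypothetical helper, not part of hpsklearn.
    """
    y_pred = estimator.predict(X_test)
    return confusion_matrix(y_test, y_pred), classification_report(y_test, y_pred)


X, y = make_classification(n_samples=100, n_features=10, n_informative=5,
                           n_redundant=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

# Stand-in (assumption): a decision tree replaces the fitted HyperoptEstimator
# so that this example is self-contained.
stand_in = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)

matrix, report = evaluate(stand_in, X_test, y_test)
print(matrix)
print(report)
```

With the search from the article already run, evaluate(model, X_test, y_test) would be called the same way.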
The library provides many classifiers, including svc, svc_linear, svc_rbf, svc_poly, svc_sigmoid, liblinear_svc, knn, ada_boost, gradient_boosting, random_forest, extra_trees, decision_tree, sgd, and xgboost_classification. To restrict the search to a single classifier, pass it in place of any_classifier, as with svc here:
estim = HyperoptEstimator(classifier=svc('SVM'),
                          preprocessing=any_preprocessing('pre'),
                          algo=tpe.suggest,
                          max_evals=10,
                          trial_timeout=300)
estim.fit(X_train, y_train)
The available preprocessing components are pca, one_hot_encoder, standard_scaler, min_max_scaler, normalizer, ts_lagselector, tfidf, rbm, and colkmeans. With the search finished, we can score the tuned SVM on the test set:
acc = estim.score(X_test, y_test)
print("Accuracy: %.3f" % acc)
print(estim.best_model())
This is how you can use HPSklearn for automatic model selection and hyperparameter optimization. Go ahead and try it, and let me know about your experience in the responses section.
This article is in collaboration with
Thanks for reading! If you want to get in touch with me, feel free to reach me on hmix13@gmail.com or my LinkedIn Profile. You can view my Github profile for different data science projects and packages tutorials. Also, feel free to explore my profile and read different articles I have written related to Data Science.