We first build a random model to serve as a baseline, so that we can compare the performance of our other models against it.

*How to compute log-loss for a random model in a multi-class setting?*

For every point in our test and cross-validation data, we randomly generate as many numbers as there are classes (9 in our problem) and then normalize them so that they sum to one.

    import numpy as np
    from sklearn.metrics import log_loss

    test_data_len = test_df.shape[0]
    cv_data_len = cv_df.shape[0]

    # Cross-validation error.
    # Output array with one row per CV point and one column per class.
    cv_predicted_y = np.zeros((cv_data_len, 9))
    for i in range(cv_data_len):  # iterate over every row of the CV data
        rand_probs = np.random.rand(1, 9)  # 9 random numbers drawn uniformly from [0, 1)
        cv_predicted_y[i] = (rand_probs / rand_probs.sum())[0]  # normalize so the row sums to 1
    print("Log loss on Cross Validation Data using Random Model", log_loss(y_cv, cv_predicted_y))

    # Test-set error.
    # Output array with one row per test point and one column per class.
    test_predicted_y = np.zeros((test_data_len, 9))
    for i in range(test_data_len):
        rand_probs = np.random.rand(1, 9)
        test_predicted_y[i] = (rand_probs / rand_probs.sum())[0]
    print("Log loss on Test Data using Random Model", log_loss(y_test, test_predicted_y))

    # argmax returns indices 0..8, so add 1 to get class labels 1..9.
    predicted_y = np.argmax(test_predicted_y, axis=1)
    plot_confusion_matrix(y_test, predicted_y + 1)

In the code above, we first created an array of zeros with 9 columns (one per class label) for every data point. We then filled each row with randomly generated probabilities, computed the log-loss, and plotted the confusion matrix.
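The same random baseline can be written without an explicit loop. The following is only a minimal sketch: `y_true` is a hypothetical stand-in for `y_cv` or `y_test`, filled with synthetic labels, and the log-loss is computed directly with NumPy as the mean negative log of the probability assigned to the true class, which should agree with what `log_loss` reports.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n_points, n_classes = 1000, 9

# Hypothetical stand-in labels in 1..9; in the article these would be y_cv or y_test.
y_true = rng.integers(1, n_classes + 1, size=n_points)

# One row of random scores per point, normalized so that each row sums to 1.
random_probs = rng.random((n_points, n_classes))
random_probs /= random_probs.sum(axis=1, keepdims=True)

# Multi-class log-loss: mean negative log of the probability given to the true class.
true_class_probs = random_probs[np.arange(n_points), y_true - 1]
random_log_loss = -np.mean(np.log(np.clip(true_class_probs, 1e-15, 1.0)))

# Predicted label = class with the highest random probability, shifted back to 1..9.
predicted_y = np.argmax(random_probs, axis=1) + 1

print("Log loss of the random model:", random_log_loss)
```

Normalizing with `keepdims=True` keeps the row sums as a column vector, so the division broadcasts across all 9 class columns at once.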

We can see that our random model has a *log-loss of* **2.4** on both the cross-validation and test data, so our models need to perform better than this. Let's check the *precision and recall* for this model.

*How to interpret the above precision and recall matrices?*

**Precision**

1. Take cell (1×1), which has a value of **0.127**: of all the points that are predicted to be *class 1*, only **12.7%** actually belong to *class 1*.

2. For original *class 4* and predicted *class 2*, we can say that of the points our model predicted to be *class 2*, **23.6%** actually belong to *class 4*.

**Recall**

1. Cell (1×1) has a value of **0.079**: of all the points that actually belong to *class 1*, our model predicted only **7.9%** to be *class 1*.

2. For original *class 8* and predicted *class 5*, the value is **0.250**: of all the points that actually belong to *class 8*, our model predicted **25%** to be *class 5*.
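The two readings above correspond to two normalizations of the same confusion matrix. Here is a minimal sketch, assuming original classes are on the rows and predicted classes on the columns, as the examples above suggest. The helper names (`confusion_counts`, `precision_matrix`, `recall_matrix`) and the tiny 3-class data are hypothetical and are not taken from the notebook.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with original classes on rows and predicted classes on columns."""
    counts = np.zeros((n_classes, n_classes), dtype=float)
    # Labels are 1-based (1..n_classes), so shift them to 0-based indices.
    np.add.at(counts, (np.asarray(y_true) - 1, np.asarray(y_pred) - 1), 1)
    return counts

def precision_matrix(counts):
    """Divide each column by its total, so every non-empty column sums to 1."""
    col_totals = counts.sum(axis=0, keepdims=True)
    return np.divide(counts, col_totals, out=np.zeros_like(counts), where=col_totals > 0)

def recall_matrix(counts):
    """Divide each row by its total, so every non-empty row sums to 1."""
    row_totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)

# Tiny hypothetical example with 3 classes.
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 2, 2, 2, 3, 3]

counts = confusion_counts(y_true, y_pred, n_classes=3)
precision = precision_matrix(counts)
recall = recall_matrix(counts)

# precision[0, 1]: of the points predicted as class 2, the fraction that are actually class 1.
print("precision[0, 1] =", precision[0, 1])
# recall[0, 0]: of the points that are actually class 1, the fraction predicted as class 1.
print("recall[0, 0] =", recall[0, 0])
```

In the precision matrix you read down a column (everything predicted as one class), while in the recall matrix you read along a row (everything that actually belongs to one class).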

We then train our models after some exploratory data analysis and feature encoding, which you can check in my notebook. We trained multiple models, and **Logistic Regression and Support Vector Machine** stand out from the rest.

*Logistic Regression*

*Support Vector Machine*

**Comparison of all the models**