This project predicts whether a patient has diabetes. I used the Pandas, NumPy, Flask, Flask-CORS, and scikit-learn (sklearn) libraries.
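I loaded the data into a variable named ‘data’ and started my exploratory data analysis (EDA). I found many 0 values in the blood pressure, skin thickness, and insulin columns. A value of 0 is not physiologically plausible for these measurements, so the zeros most likely stand in for missing readings. The dataset has no NaN values, and every value in it is numerical.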
I replaced the 0 values with the mean of their respective columns. I then separated the dataset into the dependent variable (the target) and the independent variables (the features), and split both into training and test sets using train_test_split from scikit-learn.
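The cleaning and splitting step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (BloodPressure, SkinThickness, Insulin, Outcome), the tiny made-up table, the helper fill_zeros_with_mean, and the 25% test size are all assumptions. The mean here is taken over the non-zero entries only, which is one reasonable reading of "mean of the column"; using the plain column mean would be a one-line change.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny made-up table; the column names and values are assumptions for illustration.
data = pd.DataFrame({
    "BloodPressure": [72, 0, 64, 66, 0, 74, 50, 70],
    "SkinThickness": [35, 29, 0, 23, 35, 0, 32, 45],
    "Insulin": [0, 0, 0, 94, 168, 0, 88, 543],
    "Outcome": [1, 0, 1, 0, 1, 0, 1, 0],
})

# Columns where a 0 is treated as a missing reading.
zero_as_missing = ["BloodPressure", "SkinThickness", "Insulin"]


def fill_zeros_with_mean(df, columns):
    """Return a copy of df where zeros in `columns` become the mean of the non-zero values."""
    out = df.copy()
    for col in columns:
        nonzero_mean = out.loc[out[col] != 0, col].mean()
        out[col] = out[col].astype(float).replace(0.0, nonzero_mean)
    return out


cleaned = fill_zeros_with_mean(data, zero_as_missing)

# Independent variables (features) and dependent variable (target).
X = cleaned.drop(columns="Outcome")
y = cleaned["Outcome"]

# Hold out 25% of the rows for testing; stratify keeps the class balance in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
```

Fixing random_state makes the split reproducible, which helps when comparing scores between runs.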
I fit a Random Forest classifier from scikit-learn on the training set and evaluated it on the test set. It scored 96% accuracy and a 94% F1 score. I then saved the model to a pickle file using the pickle library.
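A rough sketch of the train, evaluate, and save steps is shown here. It runs on synthetic data from make_classification rather than the diabetes dataset, so its scores will not match the ones reported above. The hyperparameters, the 20% test size, and the file name model.pkl are assumptions for illustration.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned diabetes data (8 numeric features, binary target).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the classifier on the training set only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score the model on rows it has not seen during training.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")

# Save the fitted model to a pickle file, then load it back to check the round trip.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")  # hypothetical file name
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)
```

A pickled model should only be loaded from a trusted source, and ideally with the same scikit-learn version that produced it.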
I built an API with Flask so that the website can use the model to make predictions for patients. It has three routes: a homepage route that collects the patient's details, a predict route that processes those details and predicts diabetes, and a results route that shows the prediction.
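The route layout could look something like the sketch below. It is a simplified illustration under several assumptions: the field names in FEATURES are hypothetical, StandInModel is a placeholder for the unpickled Random Forest, and the predict route returns the result as JSON directly instead of handing off to a separate results page. Flask-CORS setup is left out to keep the example short.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical input fields; the real project's form fields may differ.
FEATURES = ["blood_pressure", "skin_thickness", "insulin"]


class StandInModel:
    """Placeholder for the unpickled classifier; it flags high insulin only."""

    def predict(self, rows):
        return [1 if row[2] > 150 else 0 for row in rows]


# In the real app this would be the model loaded from the pickle file.
model = StandInModel()


@app.route("/")
def home():
    # Homepage: tells the client which details to send.
    return jsonify({"message": "POST patient details to /predict", "fields": FEATURES})


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body and check that every expected field is present.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    try:
        row = [float(payload[name]) for name in FEATURES]
    except (TypeError, ValueError):
        return jsonify({"error": "all fields must be numeric"}), 400
    # The model expects a list of rows, so wrap the single patient in a list.
    prediction = int(model.predict([row])[0])
    return jsonify({"diabetic": bool(prediction)})
```

Validating the input before calling the model keeps malformed requests from surfacing as server errors. Flask's built-in test client (app.test_client()) can exercise both routes without starting a server.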
I uploaded all the files to GitHub and finally deployed the API to a cloud platform.
I did all this to practice the full workflow of a data science project. These are the things I learned along the way:
I learned how to serve a model through an API.
I learned how to deploy the API to the cloud.
Project GitHub: Diabetes Predictor