Machine learning problems always begin with a dataset. Real-world data often contains null values and categorical variables, so it needs feature-engineering techniques before modelling; such problems can arise from human error or from the sensors that recorded the data. Ready-made datasets, such as the housing-price and sales-prediction datasets available on Kaggle, avoid much of this: briefly, a ready-made dataset is one with no null values that needs little or no feature selection. The behaviour of each point in a dataset can be visualized with matplotlib's pyplot module, and it is important to visualize the data before approaching any machine learning task.
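As a minimal sketch of that first visualization step, the snippet below plots a synthetic stand-in for a housing dataset (the area and price values are made up for illustration, not from any real Kaggle dataset):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for a housing dataset: area (sq. ft) vs. price
rng = np.random.default_rng(0)
area = rng.uniform(500, 3500, size=100)
price = 50 * area + rng.normal(0, 20_000, size=100)

plt.scatter(area, price, s=10)
plt.xlabel("Area (sq. ft)")
plt.ylabel("Price")
plt.title("Visualizing the data before modelling")
plt.savefig("scatter.png")
```

A scatter plot like this makes it immediately clear whether the relationship between a feature and the target looks linear, which informs the choice of model in the next section.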

Once the data has been visualized, several regression algorithms can be compared to see which model fits best. The most common choices for a prediction task are Linear Regression, Lasso Regression, and Ridge Regression; which one to apply depends on the behaviour and nature of the dataset. Suppose someone wants to predict the price of a house from the attributes the house possesses: it is quite easy to fit a regression model to such a scenario.
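One way to compare the three models is to fit each on the same training split and look at the held-out score. A sketch using scikit-learn, on synthetic data standing in for house attributes (the features and coefficients here are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import train_test_split

# Synthetic house-price data: two features (e.g. area, rooms), linear target
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0, 0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit each candidate and record its held-out R^2 score
scores = {}
for model in (LinearRegression(), Lasso(alpha=0.1), Ridge(alpha=1.0)):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)
print(scores)
```

On genuinely linear data all three score similarly; Lasso and Ridge mainly pay off when there are many correlated or irrelevant features, since their regularization shrinks unhelpful weights.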

Several hyperparameters often affect a model's performance on a dataset and how quickly it reaches a good accuracy level. Selecting the hyperparameters that fit a model best is often tedious work. To avoid that complexity, search utilities like GridSearchCV and RandomizedSearchCV have been introduced. Both work well in spite of a few drawbacks: grid search is exhaustive and therefore slow on large grids, while randomized search may miss the best combination.
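A minimal sketch of GridSearchCV, tuning the regularization strength `alpha` of a Ridge model over synthetic data (the grid values and data are illustrative assumptions, not recommendations):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Illustrative data; in practice substitute your own feature matrix and target
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(150, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.3, size=150)

# Exhaustively try each alpha with 5-fold cross-validation
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(Ridge(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Swapping `GridSearchCV` for `RandomizedSearchCV` (with a distribution instead of a fixed list and an `n_iter` budget) trades exhaustiveness for speed when the grid is large.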

With that introduction, let us focus first on Linear Regression, one of the most common approaches for continuous targets such as housing prices. In simple linear regression the fitted curve is a straight line, which is the best option for linearly behaved datasets. The fitted curve can also be more complex, depending on the data points in the dataset; in most real-life situations the best fit has non-zero curvature (i.e., a curved line), obtained with well-chosen hyperparameters.

Focusing on the case where the model performs well on linearly distributed data, the fitted line follows the straight-line equation y = mx + c, where m is the slope and c is the intercept. A dataset comprises input features and target variables. The more data samples you collect, the better the model's performance and accuracy will be. A model trained on many features assigns each feature an importance according to its weight.
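Fitting y = mx + c to data is a one-liner with NumPy's least-squares polynomial fit. In this sketch the points are generated from a known line (m = 2, c = 1, an illustrative choice) with a little noise, so we can check that the fit recovers the slope and intercept:

```python
import numpy as np

# Points generated from y = 2x + 1 with a little noise (illustrative data)
rng = np.random.default_rng(3)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=50)

# A degree-1 polynomial fit recovers the slope m and intercept c
m, c = np.polyfit(x, y, deg=1)
print(round(m, 2), round(c, 2))
```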

With several features x1, x2, …, the prediction takes the form ŷ = h0 + h1·x1 + h2·x2 + …, where h0, h1, h2, … are the weights given to the features when the model computes its predictions. Below is a diagram illustrating a Linear Regression graph, which shows a linear correlation between the features and the target variable.
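In scikit-learn these weights are exposed after fitting: `intercept_` corresponds to h0 and `coef_` to h1, h2, and so on. A sketch on data built from known weights (4, 2, and 3, chosen here purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data following y = 4 + 2*x1 + 3*x2 exactly (no noise)
rng = np.random.default_rng(4)
X = rng.uniform(0, 5, size=(100, 2))
y = 4.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

model = LinearRegression().fit(X, y)
# intercept_ plays the role of h0; coef_ holds h1 and h2
print(round(model.intercept_, 2), np.round(model.coef_, 2))
```

Inspecting `coef_` like this is a quick way to see which features the model weights most heavily, though on real data the features should be on comparable scales for the comparison to be fair.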