## Basics of machine learning: scikit-learn package

We will show here a very basic example of linear regression in the context of curve fitting. This toy example will allow us to illustrate key concepts such as linear models, overfitting, underfitting, regularization, and cross-validation.

We will generate a one-dimensional dataset with a simple model (including some noise), and we will try to fit a function to this data. With this function, we can predict values on new data points. This is a curve-fitting regression problem.

1. First, let’s make all the necessary imports:

```python
import numpy as np
import scipy.stats as st
import sklearn.linear_model as lm
import matplotlib.pyplot as plt
%matplotlib inline
```

2. We now define a deterministic nonlinear function underlying our generative model:

```python
f = lambda x: np.exp(3 * x)
```

3. We generate the values along the curve on [0, 2]:

```python
x_tr = np.linspace(0., 2, 200)
y_tr = f(x_tr)
```

4. Now, let’s generate data points within [0, 1]. We use the function `f` and add some Gaussian noise:

```python
x = np.array([0, .1, .2, .5, .8, .9, 1])
y = f(x) + np.random.randn(len(x))
```

5. Let’s plot our data points on [0, 1]:

```python
# x_tr spans [0, 2] with 200 points, so the first 100 points cover [0, 1].
plt.plot(x_tr[:100], y_tr[:100], '--k')
plt.plot(x, y, 'ok', ms=10)
```
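
The noisy points generated above are exactly the kind of input a scikit-learn estimator expects. As a minimal sketch (not the recipe's own next step), here is how they could be fed to `lm.LinearRegression`; note that scikit-learn estimators take a 2D array of shape `(n_samples, n_features)`, hence the reshape of the 1D `x` into a single column:

```python
import numpy as np
import sklearn.linear_model as lm

# Reproduce the toy dataset from the steps above.
f = lambda x: np.exp(3 * x)
x = np.array([0, .1, .2, .5, .8, .9, 1])
y = f(x) + np.random.randn(len(x))

# Fit an ordinary least-squares linear model.
lr = lm.LinearRegression()
lr.fit(x.reshape(-1, 1), y)

# Predict on a fine grid over [0, 1].
x_new = np.linspace(0., 1., 50)
y_pred = lr.predict(x_new.reshape(-1, 1))
print(lr.coef_, lr.intercept_)
```

Because the underlying function is an exponential, a straight line will underfit this data; that gap is precisely what motivates the richer models discussed in this recipe.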