Time series prediction is a difficult problem both to frame and to address with machine learning. In this post, we will develop neural network models for time series prediction in Python using the Keras deep learning library.

## Problem Description

The problem we are going to look at in this post is the international airline passengers prediction problem. Given a year and a month, the task is to predict the number of international airline passengers in units of 1,000. The data ranges from January 1949 to December 1960, or 12 years, with 144 observations.

Below is a sample of the first few lines of the file:

    "Month","Passengers"
    "1949-01",112
    "1949-02",118
    "1949-03",132
    "1949-04",129

We can load this dataset easily using the Pandas library. We are not interested in the date, given that each observation is separated by the same interval of one month. Therefore when we load the dataset we can exclude the first column. Once loaded we can easily plot the whole dataset. The code to load and plot the dataset is listed below:

    import pandas
    import matplotlib.pyplot as plt

    dataset = pandas.read_csv('airline-passengers.csv', usecols=[1], engine='python')
    plt.plot(dataset)
    plt.show()

which yields the following plot:

We can see an upward trend in the plot. We are going to keep things simple and work with the data as it is. Normally, it is a good idea to investigate various data preparation techniques to rescale the data and to make it stationary.
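The data preparation mentioned above is not used in the rest of this post. As a rough illustration only, here is a minimal sketch of two common techniques: first-order differencing to remove a trend, and min-max rescaling to the range 0 to 1. The helper names `difference` and `min_max_scale` are hypothetical, and the input is just the first few passenger counts shown in this post.

```python
import numpy

def difference(series):
    """First-order differencing: the value at t minus the value at t-1."""
    series = numpy.asarray(series, dtype='float32')
    return series[1:] - series[:-1]

def min_max_scale(series):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1.

    Assumes the series is not constant (otherwise this divides by zero).
    """
    series = numpy.asarray(series, dtype='float32')
    low, high = series.min(), series.max()
    return (series - low) / (high - low)

# first few monthly passenger counts (in thousands)
sample = [112, 118, 132, 129, 121, 135]
diffed = difference(sample)
scaled = min_max_scale(sample)
print(diffed)
print(scaled)
```

Note that differencing shortens the series by one value, and that predictions made on transformed data would need to be transformed back before being compared with the original passenger counts.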

## Multilayer Perceptron Regression

We want to phrase the time series prediction problem as a regression problem. That is, given the number of passengers (in units of thousands) this month, what is the number of passengers next month?

We can write a simple function to convert our single column of data into a two-column dataset. The first column contains this month's (t) passenger count and the second column contains next month's (t+1) passenger count, to be predicted.

First, let's import all of the functions and classes we need. This assumes a working SciPy environment with the Keras deep learning library installed:

    import math
    import numpy
    import matplotlib.pyplot as plt
    import pandas
    from keras.models import Sequential
    from keras.layers import Dense

Next, we load the dataset as a Pandas dataframe, extract the NumPy array from the dataframe, and convert the integer values to floating point values, which are more suitable for modelling with a neural network:

    # load the dataset
    dataframe = pandas.read_csv('airline-passengers.csv', usecols=[1], engine='python')
    dataset = dataframe.values
    dataset = dataset.astype('float32')

After we model our data and estimate the skill of our model on the training dataset, we need to get an idea of the skill of the model on new unseen data. For a normal classification or regression problem we would do this using cross validation.

With time series data, the sequence of values is important. A simple method that we can use is to split the ordered dataset into train and test datasets. The code below calculates the index of the split point and separates the data into a training dataset with the first 67% of the observations, which we can use to train our model, leaving the remaining 33% for testing the model.

    # split into train and test sets
    train_size = int(len(dataset) * 0.67)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
    print(len(train), len(test))
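As a quick sanity check on the split arithmetic, the sketch below applies the same 67% rule to a placeholder array with 144 rows, the number of observations stated earlier. The `split_ordered` helper is a hypothetical name used only for illustration, and the placeholder values are row indices rather than real passenger counts.

```python
import numpy

def split_ordered(data, train_fraction=0.67):
    """Split an ordered array into leading train rows and trailing test rows."""
    train_size = int(len(data) * train_fraction)
    return data[:train_size], data[train_size:]

# placeholder for the real data: 144 rows, 1 column, values are row indices
placeholder = numpy.arange(144, dtype='float32').reshape(-1, 1)
train_part, test_part = split_ordered(placeholder)
print(len(train_part), len(test_part))
```

No shuffling takes place, so every test row comes after every training row in time.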

Now we can define a function to create a new dataset. The function takes two arguments: the dataset, which is a NumPy array that we want to convert into a dataset, and **look_back**, which is the number of previous time steps to use as input variables to predict the next time period, in this case defaulting to 1.

This default will create a dataset where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time (t + 1):

    # convert an array of values into a dataset matrix
    def create_dataset(dataset, look_back=1):
        dataX, dataY = [], []
        for i in range(len(dataset)-look_back-1):
            a = dataset[i:(i+look_back), 0]
            dataX.append(a)
            dataY.append(dataset[i + look_back, 0])
        return numpy.array(dataX), numpy.array(dataY)

Let's take a look at the effect of this function on the first few rows of the dataset:

    X    Y
    112  118
    118  132
    132  129
    129  121
    121  135

If we compare these first 5 rows to the original dataset sample, we can see the X=t and Y=t+1 pattern in the numbers.
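To see the pattern in isolation, the sketch below repeats the `create_dataset` function from this post and applies it to the six passenger counts that appear in the table. Because the loop bound is `len(dataset)-look_back-1`, the function stops one pair short of the end of the array, so six values yield four rows with `look_back=1`. The second call uses `look_back=3`, an arbitrary value chosen only to show how the shape of X changes.

```python
import numpy

def create_dataset(dataset, look_back=1):
    """Build (X, Y) pairs where X holds look_back past values and Y the next one."""
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        dataX.append(dataset[i:(i+look_back), 0])
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

# six passenger counts from the table, as a single-column array
sample = numpy.array([112, 118, 132, 129, 121, 135], dtype='float32').reshape(-1, 1)

# look_back=1: each X row is one value (t), Y is the value at t+1
X1, Y1 = create_dataset(sample, look_back=1)
for x, y in zip(X1[:, 0], Y1):
    print(int(x), int(y))

# look_back=3: each X row is three consecutive values, Y is the value that follows
X3, Y3 = create_dataset(sample, look_back=3)
print(X3.shape, Y3.shape)
```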

Let's use this function to prepare the train and test datasets for modelling:

    # reshape into X=t and Y=t+1
    look_back = 1
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)

We can now fit a Multilayer Perceptron model to the training data. We use a simple network with 1 input, 1 hidden layer with 8 neurons and an output layer. The model is fit using mean squared error, which if we take the square root gives us an error score in the units of the dataset:

    # create and fit Multilayer Perceptron model
    model = Sequential()
    model.add(Dense(8, input_dim=look_back, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=200, batch_size=2, verbose=2)
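To make the architecture concrete, here is a rough NumPy sketch of the forward pass that a network of this shape computes: one input, a hidden layer of 8 ReLU units, and a single linear output. This is not the Keras model and involves no training. The weights are random placeholders, and the names `W1`, `b1`, `W2`, `b2` and `forward` are hypothetical.

```python
import numpy

rng = numpy.random.default_rng(0)
look_back = 1

# hidden layer: look_back inputs -> 8 units; output layer: 8 units -> 1 value
W1, b1 = rng.normal(size=(look_back, 8)), numpy.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), numpy.zeros(1)

def forward(x):
    """Forward pass: dense layer with ReLU, then a linear output layer."""
    hidden = numpy.maximum(0.0, x @ W1 + b1)  # ReLU activation
    return hidden @ W2 + b2                   # no activation on the output

# two example inputs (passenger counts at time t), one per row
batch = numpy.array([[112.0], [118.0]])
out = forward(batch)

# weights plus biases in both layers
n_params = W1.size + b1.size + W2.size + b2.size
print(out.shape, n_params)
```

With `look_back=1`, the network has 8 + 8 parameters in the hidden layer and 8 + 1 in the output layer, 25 in total, which is very small for a neural network.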

Once the model is fit, we can estimate the performance of the model on the train and test datasets. This will give us a point of comparison for new models.

    # Estimate model performance
    trainScore = model.evaluate(trainX, trainY, verbose=0)
    print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
    testScore = model.evaluate(testX, testY, verbose=0)
    print('Test Score: %.2f MSE (%.2f RMSE)' % (testScore, math.sqrt(testScore)))
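The two numbers printed for each dataset are related by a square root: MSE is measured in squared units (thousands of passengers, squared), and RMSE brings it back to thousands of passengers. The sketch below works through that relationship by hand. The `mean_squared_error` helper is a hypothetical name, and the predicted values are made up for illustration, not produced by the model.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)

actual = [118.0, 132.0, 129.0]     # real passenger counts (thousands)
predicted = [120.0, 130.0, 125.0]  # made-up predictions for illustration

mse = mean_squared_error(actual, predicted)
rmse = math.sqrt(mse)
print('%.2f MSE (%.2f RMSE)' % (mse, rmse))
```

Here the errors are -2, 2 and 4, so the squared errors are 4, 4 and 16, and their mean is 8.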

Finally, we can generate predictions using the model for both the train and test dataset to get a visual indication of the skill of the model.

Because of how the dataset was prepared, we must shift the predictions so that they align on the x-axis with the original dataset. Once prepared, the data is plotted, showing the original dataset in blue, the predictions for the train dataset in green, and the predictions on the unseen test dataset in red:

    # generate predictions for training
    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)
    # shift train predictions for plotting
    trainPredictPlot = numpy.empty_like(dataset)
    trainPredictPlot[:, :] = numpy.nan
    trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
    # shift test predictions for plotting
    testPredictPlot = numpy.empty_like(dataset)
    testPredictPlot[:, :] = numpy.nan
    testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
    # plot baseline and predictions
    plt.plot(dataset)
    plt.plot(trainPredictPlot)
    plt.plot(testPredictPlot)
    plt.show()
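The index arithmetic in the shifting step can be checked without a trained model. The sketch below assumes 144 observations, the 67% split and `look_back=1`, and uses constant placeholder arrays in place of real predictions. The names `train_plot`, `test_plot` and `test_start` are hypothetical.

```python
import numpy

n_obs, look_back = 144, 1
train_size = int(n_obs * 0.67)
test_size = n_obs - train_size

# create_dataset returns len(data) - look_back - 1 samples
n_train_pred = train_size - look_back - 1
n_test_pred = test_size - look_back - 1

# placeholder predictions (constants, not model output)
train_pred = numpy.ones((n_train_pred, 1), dtype='float32')
test_pred = numpy.full((n_test_pred, 1), 2.0, dtype='float32')

# train predictions start at index look_back, the first predicted time step
train_plot = numpy.full((n_obs, 1), numpy.nan, dtype='float32')
train_plot[look_back:n_train_pred + look_back, :] = train_pred

# test predictions start look_back steps after the first test observation
test_plot = numpy.full((n_obs, 1), numpy.nan, dtype='float32')
test_start = n_train_pred + (look_back * 2) + 1
test_plot[test_start:n_obs - 1, :] = test_pred

print(n_train_pred, n_test_pred, test_start)
```

The NaN entries are what keep the untouched positions blank when the arrays are plotted, so each prediction lines up with the month it was made for.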

yields

From the plot, we can see that the model did a pretty poor job of fitting both the training and the test datasets. It essentially predicted the input value as the output, with the following scores:

    Train Score: 531.71 MSE (23.06 RMSE)
    Test Score: 2355.06 MSE (48.53 RMSE)

Taking the RMSE values, we can see that the model has an average error of about 23 passengers (in thousands) on the training dataset and about 48 passengers (in thousands) on the test dataset.
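Since the model's output is close to simply repeating its input, a natural point of comparison is a persistence forecast, which predicts that next month's value equals this month's. The sketch below computes the RMSE of such a forecast. The `persistence_rmse` helper is a hypothetical name, and it is run here only on the handful of passenger counts shown in this post, so the resulting number is not comparable to the scores above.

```python
import numpy

def persistence_rmse(series):
    """RMSE of predicting each value with the value that came before it."""
    series = numpy.asarray(series, dtype='float64')
    predicted = series[:-1]  # value at t, used as the forecast for t+1
    actual = series[1:]      # value actually observed at t+1
    return float(numpy.sqrt(numpy.mean((actual - predicted) ** 2)))

# first few monthly passenger counts (in thousands)
sample = [112, 118, 132, 129, 121, 135]
score = persistence_rmse(sample)
print('Persistence RMSE: %.2f' % score)
```

Applying the same function to the full test portion of the dataset would give a baseline against which to judge any model's test RMSE.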