LSTM

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) that can learn long-term dependencies in time series data. Unlike an MLP, an LSTM processes the data sequentially and carries state from one time step to the next, so it can model dependencies between time steps. It is therefore likely to be better suited to time series prediction than an MLP.

LSTM architecture

We use stacked LSTM layers to capture long-term dependencies effectively and to discover features automatically. The hidden states from these LSTM layers are flattened and combined with the input data. Finally, a fully connected (FC) layer is applied to produce the prediction.

The concatenation of the input before the FC layer can be viewed as a skip connection, similar in spirit to those used in a Residual Network (ResNet). It allows the model to account for both the general patterns learned by the LSTM and the unique characteristics of each individual case.
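The architecture described above can be sketched in PyTorch as follows. This is a minimal illustration, not the actual implementation: the class name `StackedLSTMWithSkip`, the argument names, and the layer sizes are hypothetical. It also assumes that "hidden states" means the final hidden state of each stacked layer, and that the input window is flattened before concatenation.

```python
import torch
import torch.nn as nn


class StackedLSTMWithSkip(nn.Module):
    """Stacked LSTM whose flattened hidden states are concatenated with the raw input."""

    def __init__(self, input_size, seq_len, hidden_size, num_layers, output_size):
        super().__init__()
        # num_layers > 1 stacks LSTM layers on top of each other
        self.lstm = nn.LSTM(input_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        # FC input = flattened hidden states + flattened input window (assumption)
        fc_in = num_layers * hidden_size + seq_len * input_size
        self.fc = nn.Linear(fc_in, output_size)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        _, (h_n, _) = self.lstm(x)
        # h_n: (num_layers, batch, hidden_size) -> (batch, num_layers * hidden_size)
        hidden = h_n.permute(1, 0, 2).reshape(x.size(0), -1)
        # Skip path: the raw input window, flattened per sample
        skip = x.reshape(x.size(0), -1)
        return self.fc(torch.cat([hidden, skip], dim=1))


# Toy usage: 4 windows of 12 steps with 1 feature, predicting 3 horizons
model = StackedLSTMWithSkip(input_size=1, seq_len=12, hidden_size=16,
                            num_layers=2, output_size=3)
out = model(torch.randn(4, 12, 1))
```

Because the raw window reaches the FC layer directly, the final layer can weigh the most recent observations alongside the features extracted by the LSTM.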

LSTM for Air Passengers

To demonstrate model performance, we show the model's prediction results for the air passengers dataset. The cross-validation process identified the best transformation for making the time series stationary, as well as the optimal hyperparameters. The Root Mean Squared Error (RMSE) of the next-day prediction was used to select the best model.
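The selection procedure can be sketched as below. This is an illustration under stated assumptions, not the project's actual code: it assumes expanding-window cross-validation, and the helper names (`rmse`, `expanding_window_splits`, `select_best`) are hypothetical. The stand-in forecaster `mean_of_last` replaces the LSTM so that the example runs without training a network.

```python
import itertools
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def expanding_window_splits(n_samples, n_folds, val_size):
    """Yield (train_end, val_end); each fold trains on [0, train_end)."""
    for fold in range(n_folds):
        val_end = n_samples - (n_folds - 1 - fold) * val_size
        yield val_end - val_size, val_end


def select_best(series, grid, fit_predict, n_folds=3, val_size=2):
    """Return the grid entry with the lowest mean validation RMSE."""
    best_params, best_score = None, float("inf")
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        scores = []
        for train_end, val_end in expanding_window_splits(len(series), n_folds, val_size):
            preds = fit_predict(series[:train_end], val_end - train_end, **params)
            scores.append(rmse(series[train_end:val_end], preds))
        mean_score = sum(scores) / len(scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def mean_of_last(train, horizon, window):
    """Stand-in forecaster (not an LSTM): repeat the mean of the last `window` values."""
    level = sum(train[-window:]) / window
    return [level] * horizon


# Toy series, made up for illustration only
toy_series = [float(i) for i in range(10)]
best_params, best_score = select_best(toy_series, {"window": [1, 3]}, mean_of_last)
```

In a real run, `fit_predict` would train the LSTM on the training slice and return its forecasts, and the grid would hold the candidate transformations and hyperparameters.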

The chart below illustrates:

  1. train: training data
  2. prediction(train): the model's prediction for the training data periods
  3. test(input): test input data
  4. test(actual): test actual data
  5. prediction(test): the model's predictions for the selected days (the 1st, 2nd, and 7th) of the "test(actual)" period, given "test(input)".

The LSTM model predicts the percentage increase in air passengers, as described in "Capturing trends". The number of LSTM layers and the hidden dimension were determined by grid search.
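As a rough illustration of predicting percentage increases rather than raw counts, the sketch below converts a series to period-over-period percentage changes and maps predicted changes back to levels. It assumes the increase is measured relative to the previous observation, which may differ from the definition used in "Capturing trends". The function names and the toy numbers are hypothetical.

```python
def to_pct_change(values):
    """Convert levels to period-over-period percentage increases."""
    return [(curr - prev) / prev * 100.0 for prev, curr in zip(values, values[1:])]


def from_pct_change(last_value, pct_changes):
    """Rebuild levels from percentage increases, starting from last_value."""
    levels = []
    for pct in pct_changes:
        last_value = last_value * (1.0 + pct / 100.0)
        levels.append(last_value)
    return levels


# Toy passenger counts, made up for illustration only
passengers = [100.0, 110.0, 121.0]
pct = to_pct_change(passengers)
# Applying the same changes to the first level recovers the later levels
rebuilt = from_pct_change(passengers[0], pct)
```

Working with percentage changes removes much of the upward trend from the series, and the inverse step turns the model's output back into passenger counts for plotting and evaluation.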