NLinear

NLinear is a simple neural network model that uses a linear layer to predict the next value in the sequence.

NLinear Architecture

The paper Are Transformers Effective for Time Series Forecasting? (github) explored the effectiveness of transformers for time series forecasting. In doing so, the authors proposed two simple models, NLinear and DLinear, and found that they are competitive with state-of-the-art transformer-based models.

NLinear works as follows:

  1. Subtract the last value of the sequence from the entire sequence.
  2. Then it goes through a linear layer.
  3. Finally, it adds the last value of the sequence to the output of the linear layer.
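The three steps above can be sketched in PyTorch roughly as follows (a minimal univariate sketch; the layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class NLinear(nn.Module):
    """Minimal NLinear sketch: subtract the last value, apply a linear layer, add it back."""

    def __init__(self, seq_len: int, pred_len: int):
        super().__init__()
        self.linear = nn.Linear(seq_len, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len), a univariate input window
        last = x[:, -1:]              # step 1: last value of each sequence
        out = self.linear(x - last)   # step 2: linear layer on the shifted sequence
        return out + last             # step 3: add the last value back
```

Subtracting the last value acts as a simple per-window normalization, which is what makes this model robust to distribution shift between train and test periods.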

The limitation of NLinear, however, is that it can only handle univariate time series. Therefore, we made slight changes so that it can consume exogenous features, as follows:

  1. Subtract the last value of the sequence from the entire sequence.
  2. Flatten along the time step dimension and feature dimension.
  3. Then it goes through a linear layer.
  4. Finally, it adds the last value of the target (i.e., closing stock price) to the output.

Also, since the target to predict is the percentage increase rather than the raw value, as explained in capturing trends, we test both percentage increases and raw values as input data.

NLinear for Air Passengers

To demonstrate model performance, we show the model's prediction results for the air passengers dataset. The cross-validation process identified the best transformation to make the time series stationary and the optimal hyperparameters. The Root Mean Squared Error on the next period's value was used to select the best model.
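For reference, the RMSE metric used for model selection can be computed as a simple sketch like this:

```python
import numpy as np

def rmse(y_true, y_pred) -> float:
    """Root Mean Squared Error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

A lower RMSE on the held-out validation split indicates a better transformation/hyperparameter combination.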

In the chart, we display the model's predictions for the last split of cross validation and for the test data.

  1. train: Training data of the last split.
  2. validation: Validation data of the last split.
  3. prediction (train, validation): Predictions for the train and validation data periods. For each row (or sliding window) of data, predictions are made for n days into the future (where n is set to 1, 2, 7). The predictions are then combined into a single series of dots. Since prediction accuracy decreases for large n, we see some hiccups in the predictions. The predictions from the tail of the train period spill into the validation period, as that is the future from the 'train' period's viewpoint. These settings are somewhat peculiar, but they work well for testing whether the model's predictions are good enough.
  4. test(input): Test input data.
  5. test(actual): Test actual data.
  6. prediction(test): The model's prediction given the test input. There is only one prediction, from the last row (or last sliding window) of the test input, which corresponds to 1, 2, and 7 days after 'test(input)'.