TS Mixer (Time-Series Mixer), developed by Google Cloud AI, is an architecture that applies MLPs to time-series data across both the feature and time dimensions.
Figure 1 in the TS Mixer paper effectively illustrates the architecture of TS Mixer. In summary, the model comprises the following components: time-mixing MLPs applied across time steps, feature-mixing MLPs applied across features, residual connections around each mixing operation, and a final temporal projection layer that maps the input window to the forecast horizon.
The paper notes that feature mixing can change the data's dimensionality; in such cases, a fully connected layer is applied to the residual branch to match the dimensions. However, we omit these cases during hyperparameter optimization to reduce the search space.
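For concreteness, the mixing operations described above can be sketched in NumPy. This is a simplified sketch, not the paper's implementation: the function names and parameter shapes are ours, and the normalization and dropout used in the paper are omitted. It also shows the residual projection applied when feature mixing changes the feature dimension.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    # Two-layer MLP with ReLU, applied along the last axis.
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

def mixer_block(x, time_params, feat_params, proj=None):
    """One TS Mixer-style block on x of shape (time, features).

    Time mixing runs an MLP over the time axis (by transposing),
    feature mixing runs an MLP over the feature axis, and each
    mixing operation is wrapped in a residual connection.
    Normalization and dropout from the paper are omitted here.
    """
    # Time mixing: make time the last axis, mix, transpose back.
    x = x + mlp(x.T, *time_params).T
    # Feature mixing; if it changes the feature dimension, a linear
    # projection (`proj`) on the residual branch matches the shapes.
    y = mlp(x, *feat_params)
    if proj is not None:
        x = x @ proj
    return x + y
```

For example, with an input of shape `(8, 4)` and feature mixing that keeps 4 features, `proj` can be `None`; if feature mixing outputs 6 features, `proj` must be a `(4, 6)` matrix.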
Instead of mixing auxiliary features separately, as depicted in Figure 4 of the paper, we feed in all the generated features directly, since our features exist for all time steps. There is no need for alignment between historical prices and features.
Also, while data scaling is applied as preprocessing, no local normalization is used. Our goal is practical use rather than benchmarking, and we speculate that 2D batch normalization would be sufficient.
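As a sketch of this preprocessing choice, global per-feature standardization fit on the training split only (rather than per-window local normalization) might look like the following. The helper names are ours; a library scaler such as scikit-learn's `StandardScaler` would serve the same purpose.

```python
import numpy as np

def fit_scaler(train):
    """Fit global per-feature standardization on training data only.

    No per-window (local) normalization is applied; every window is
    scaled with the same statistics estimated from the training split.
    """
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return mu, sigma

def transform(x, mu, sigma):
    # Apply the fitted statistics to any split (train, validation, test).
    return (x - mu) / sigma
```

Fitting on the training split alone avoids leaking test-set statistics into the model's inputs.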
TS Mixer is efficient in that it combines feature- and time-mixing operations with residual connections. In the experiments presented in the paper, it outperformed transformer models such as FEDformer, Autoformer, and Informer. Moreover, it is faster to train and predict with than slower deep learning models such as LSTMs.
To demonstrate model performance, we show the model's predictions on the air passengers dataset. Cross-validation identified the best transformation for making the time series stationary, along with the optimal hyperparameters. The root mean squared error (RMSE) of the one-step-ahead forecast was used to select the best model.
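The selection procedure above can be sketched as an expanding-window cross-validation that scores one-step-ahead forecasts by RMSE. The function names and the naive forecaster in the usage example are ours, for illustration; in practice `forecast_fn` would wrap the fitted TS Mixer model.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean squared error between two sequences.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def expanding_window_cv(series, forecast_fn, n_splits=3, min_train=24):
    """Score one-step-ahead forecasts over expanding-window splits.

    forecast_fn(history) -> prediction for the next step; it stands
    in for the fitted model. Each split trains on a longer prefix of
    the series and is evaluated on the single next observation.
    """
    sq_errors = []
    # Place one validation point after each expanding training window.
    cut_points = np.linspace(min_train, len(series) - 1, n_splits, dtype=int)
    for cut in cut_points:
        history, target = series[:cut], series[cut]
        sq_errors.append((target - forecast_fn(history)) ** 2)
    return float(np.sqrt(np.mean(sq_errors)))
```

For example, on `np.arange(30.0)` with a naive last-value forecaster (`lambda h: h[-1]`), every one-step error is 1, so the cross-validation RMSE is 1.0.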
In the chart, we display the model's predictions for the last cross-validation split and for the test data.