GBM (Gradient Boosting Machine) is an ensemble of weak learners, typically decision trees. Trees are added one at a time, each one correcting the errors of the ensemble built so far, which yields high predictive performance.
GBM is an exceptionally effective model for both tabular data prediction and time series forecasting. Despite the advent of numerous deep learning models, GBM remains a top performer. The Kaggle 2023 AI report highlights this, noting the "continuing dominance of gradient boosted trees" and that "tabular data ... remains largely unaffected by the deep learning revolution".
Since GBM works with tabular data, the time series must be reshaped into rows of features: lagged values, ratios between values, and other derived features. Each row then carries both past and present information, which is what makes GBM training possible.
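As a minimal sketch of this reshaping step (the series values and column names below are illustrative, not the dataset used in this article), lag and ratio features can be built with pandas:

```python
import pandas as pd

# Hypothetical univariate series; values are illustrative only.
df = pd.DataFrame({"y": [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]})

# Lagged values: each row carries the previous observations as features.
for lag in (1, 2, 3):
    df[f"lag_{lag}"] = df["y"].shift(lag)

# Value ratio: the current value relative to the previous one.
df["ratio_1"] = df["y"] / df["y"].shift(1)

# Rows whose lags fall before the start of the series have NaNs and are dropped.
df = df.dropna().reset_index(drop=True)
```

Each remaining row is now an ordinary tabular sample: the target `y` alongside features derived purely from past values.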
We use LightGBM, specifically LGBMRegressor, with the 'gbdt' (Gradient Boosting Decision Tree) boosting type, early stopping, the regression objective, and rmse as the evaluation metric. Other parameters such as max_depth, min_data_in_leaf, n_estimators, and colsample_bytree were tuned by grid search.
To demonstrate model performance, we show the model's predictions on the air passengers dataset. Cross validation identified both the transformation that best makes the series stationary and the optimal hyperparameters; the root mean squared error (RMSE) of the one-step-ahead forecast was used to select the best model.
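For time series, this evaluation is typically done with rolling-origin (expanding-window) splits rather than shuffled folds, so the model only ever predicts values after its training window. A minimal sketch, assuming a one-step horizon and using a last-value forecaster as a stand-in for the fitted GBM:

```python
import numpy as np

def expanding_window_splits(n, min_train, horizon=1):
    """Yield (train_idx, test_idx) pairs for rolling-origin evaluation:
    each fold trains on all points up to the origin, tests on the next step."""
    for origin in range(min_train, n - horizon + 1):
        yield np.arange(origin), np.arange(origin, origin + horizon)

# Illustrative series, not the actual air passengers data.
y = np.array([112.0, 118, 132, 129, 121, 135, 148, 148, 136, 119])

errors = []
for train_idx, test_idx in expanding_window_splits(len(y), min_train=5):
    # Stand-in forecaster: repeat the last observed value.
    # In the real pipeline, the GBM is refit on y[train_idx] here.
    pred = y[train_idx][-1]
    errors.append((y[test_idx][0] - pred) ** 2)

rmse = float(np.sqrt(np.mean(errors)))
```

The candidate stationarity transformations and hyperparameter settings are each scored this way, and the combination with the lowest RMSE wins.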
The chart below illustrates: