Cross validation is a technique to evaluate the performance of a model.
When dealing with time series data such as Bitcoin prices, we evaluate a model by training it on past data and testing it on future data.
For example, a model is trained on the earliest 80% of the data and evaluated on the remaining 20%. It is important that the test data lies in the future of the training data, so that the evaluation measures how the model performs on an unseen future.
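The 80/20 split described here can be sketched in a few lines of Python. This is only an illustration: the function name `chronological_split` and the price values are made up, and the data is assumed to be ordered from oldest to newest.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    # Everything before the cut is the past; everything after is the future.
    return series[:cut], series[cut:]


# Made-up prices, oldest first (illustrative values only).
prices = [100, 102, 101, 105, 107, 110, 108, 112, 115, 117]
train, test = chronological_split(prices)
print(len(train), len(test))
```

Because the split is a single cut in time, no observation in the test part precedes any observation in the training part.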
Cross validation extends the idea of a data split by performing the split repeatedly.
The downside of a single data split is that the evaluation rests on one split only. There is a chance that the model looks better (or worse) than it really is purely by luck.
Cross validation addresses this problem by training and evaluating over multiple data splits. The diagram below shows the idea of cross validation. Since it splits the data 5 times, it is called 5-fold cross validation, or 5-CV for short.
In the first run, we use the first 16.7% (= 100% / (5 + 1), rounded) of the data for training, and the immediately following 16.7% for testing. The rest of the data is unused. Likewise, in the second run, we use the first 33.3% (= 2 * 100% / 6) of the data for training, and the immediately following 16.7% for testing. We repeat this for a total of 5 training and evaluation runs. The final result is the average performance over those runs.
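As a rough sketch of this procedure, the data can be cut into 5 + 1 equal blocks, with run k training on the first k blocks and testing on block k + 1. The names `expanding_window_splits`, `cross_validate`, and `fit_and_score` are hypothetical, and the sketch assumes that any leftover samples that do not fill a whole block are simply dropped from the end.

```python
def expanding_window_splits(n_samples, n_folds=5):
    """Return (train_indices, test_indices) pairs for an expanding-window CV."""
    block = n_samples // (n_folds + 1)  # size of one block, about 16.7% for 5 folds
    if block == 0:
        raise ValueError("not enough samples for the requested number of folds")
    splits = []
    for k in range(1, n_folds + 1):
        train_idx = range(0, k * block)               # first k blocks
        test_idx = range(k * block, (k + 1) * block)  # the block right after
        splits.append((train_idx, test_idx))
    return splits


def cross_validate(series, fit_and_score, n_folds=5):
    """Average the score returned by fit_and_score(train, test) over all runs."""
    scores = []
    for train_idx, test_idx in expanding_window_splits(len(series), n_folds):
        train = [series[i] for i in train_idx]
        test = [series[i] for i in test_idx]
        scores.append(fit_and_score(train, test))
    return sum(scores) / len(scores)


# Show which index ranges each of the 5 runs would use for 60 samples.
for run, (train_idx, test_idx) in enumerate(expanding_window_splits(60), start=1):
    print(run, (train_idx.start, train_idx.stop), (test_idx.start, test_idx.stop))
```

Here `fit_and_score` stands for whatever trains a model on `train` and returns a single number for `test`, for instance a mean absolute error.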
Autoregressive models, such as ARIMA, pose a challenge for the cross-validation method described above. If we simply split the data into training and test sets and fit a model on the training set, the only way to make predictions over the test period is to generate prices autoregressively, feeding each prediction back in as the input for the next one. As a result, the observed prices in the test set are never used as model inputs.
To address this issue, we redefine the test data to include both the training and test sets, so that the entire price history is always available. When making a prediction for a specific day, we refit the ARIMA model on all data available up to that day, while keeping the hyperparameters selected on the training data, such as the order and seasonal order, fixed. This approach allows ARIMA to use features (i.e., past prices) from the test set.
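The refitting loop can be sketched as follows. To keep the example dependency-free, a least-squares AR(1) model stands in for ARIMA; it is not ARIMA itself, and in practice `fit_ar1` would be replaced by an ARIMA fit whose order and seasonal order stay fixed. The names `fit_ar1` and `walk_forward_forecast` are hypothetical, and "up to that day" is assumed to mean all prices observed strictly before the day being predicted.

```python
def fit_ar1(history):
    """Stand-in for an ARIMA fit: least-squares fit of y[t] = a + b * y[t-1]."""
    x, y = history[:-1], history[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        return mean_y, 0.0  # constant history: fall back to predicting the mean
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return mean_y - slope * mean_x, slope


def walk_forward_forecast(prices, n_train):
    """One-step-ahead predictions for prices[n_train:], refitting before each one."""
    if n_train < 2:
        raise ValueError("need at least two training observations")
    predictions = []
    for t in range(n_train, len(prices)):
        history = prices[:t]  # training data plus the test days already observed
        intercept, slope = fit_ar1(history)  # model structure (order 1) stays fixed
        predictions.append(intercept + slope * history[-1])
    return predictions


# Made-up, perfectly linear prices so the behaviour is easy to follow.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
predictions = walk_forward_forecast(prices, n_train=4)
print(predictions)
```

Each prediction is made from real observed prices rather than from the model's own earlier predictions, which is what lets the test-period prices serve as inputs.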