Certain models, such as ARIMA and ETS, are specifically designed to process timeseries data. Others, such as decision trees, struggle to extrapolate trends because they cannot predict values outside the range seen in training.
Instead of predicting absolute values that were not observed in the training data, we can let the model forecast relative changes, i.e., $(y_{t+h} - y_t)/y_t$ rather than $y_{t+h}$. For example, if the training data contained 100 followed by 200, the increase is 100%. A model that has learned to predict a 100% increase can then output 400 as the future value (starting from the last observed value of 200), even though numbers in that range were never observed during training.
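The idea above can be sketched as follows. This is a minimal toy example, not a production forecasting setup: the series, the single lagged-change feature, and the use of `DecisionTreeRegressor` are all illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical trending series: doubles every step (100% growth).
y = np.array([100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0])

# Targets are relative changes (y[t+1] - y[t]) / y[t], not y[t+1] itself.
rel_change = (y[1:] - y[:-1]) / y[:-1]  # array of 1.0 values

# Toy feature: the previous relative change (illustrative choice).
X = rel_change[:-1].reshape(-1, 1)
target = rel_change[1:]

model = DecisionTreeRegressor().fit(X, target)

# Forecast the next relative change, then map back to an absolute value.
next_change = model.predict([[rel_change[-1]]])[0]
forecast = y[-1] * (1.0 + next_change)
print(forecast)  # 6400.0, outside the training range of y
```

Even though the tree only ever saw relative changes around 1.0, the reconstructed forecast of 6400 lies well beyond any absolute value in the training data.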
For the input data, we apply the transformations explained in the section on stationary timeseries. By using differenced historical values, the model focuses on learning from changes in the features rather than their absolute values. When differencing is combined with a log transformation, it effectively captures relative changes in the features, since $\log(x_{t+h}) - \log(x_t) = \log(x_{t+h}/x_t)$. However, whether these transformations are actually applied is determined by grid search.
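The log-difference identity can be verified on a small example. The 10%-growth series below is an illustrative assumption:

```python
import numpy as np

# Hypothetical feature series growing 10% per step.
x = np.array([100.0, 110.0, 121.0, 133.1])

# Differenced log values: log(x[t+1]) - log(x[t]) = log(x[t+1] / x[t])
log_diff = np.diff(np.log(x))

# Every entry equals log(1.10) ~ 0.0953, regardless of the absolute level.
ratios = np.exp(log_diff)  # recovers the ratios x[t+1] / x[t], all 1.1
```

Because each log-differenced value depends only on the ratio between consecutive points, the transformed features look the same whether the series hovers around 100 or around 10,000.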