The amount of historical data collected is growing together with the applicability of business intelligence and the demand for automatic time series forecasting. While no single time series modeling method is universal to all types of dynamics, forecasting with an ensemble of several methods is often seen as a compromise. Instead of fixing ensemble diversity and size, we propose to predict these aspects adaptively using meta-learning. Meta-learning here uses two separate random forest regression models, built on 390 time series features, to rank 22 univariate forecasting methods and to recommend an ensemble size. The forecasting ensemble is then formed from the top-ranked methods, and their forecasts are pooled using either a simple or a weighted average (with weights corresponding to reciprocal rank). The proposed approach was tested on 12561 micro-economic time series (expanded to 38633 for various forecasting horizons) of the M4 competition, where meta-learning outperformed the Theta and Comb benchmarks in terms of relative forecasting errors for all data types and horizons. The best overall results were achieved by weighted pooling, with a symmetric mean absolute percentage error of 9.21% versus 11.05% obtained using the Theta method.
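The pooling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pool_forecasts` and `smape` are hypothetical, the weights are assumed to be the reciprocal of each method's position among the selected methods and then normalized to sum to one, and the error measure is assumed to follow the M4 competition's sMAPE definition. The ranks and the ensemble size are taken as given inputs here; in the proposed approach they would come from the two random forest meta-models.

```python
import numpy as np

def pool_forecasts(forecasts, ranks, ensemble_size, weighted=True):
    """Pool the forecasts of the best-ranked methods (illustrative sketch).

    forecasts     : array of shape (n_methods, horizon), one row per method
    ranks         : predicted rank per method, lower is better
    ensemble_size : number of top-ranked methods to keep
    weighted      : reciprocal-rank weights if True, simple average if False
    """
    forecasts = np.asarray(forecasts, dtype=float)
    ranks = np.asarray(ranks, dtype=float)

    # Indices of the best-ranked methods, best first.
    top = np.argsort(ranks, kind="stable")[:ensemble_size]

    if weighted:
        # Assumption: weight = 1 / position within the ensemble (1, 2, ...).
        positions = np.arange(1, len(top) + 1)
        weights = 1.0 / positions
    else:
        weights = np.ones(len(top))

    # Assumption: weights are normalized so that they sum to one.
    weights = weights / weights.sum()
    return weights @ forecasts[top]

def smape(actual, forecast):
    """Symmetric mean absolute percentage error in percent (M4-style)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    ratio = 2.0 * np.abs(forecast - actual) / (np.abs(actual) + np.abs(forecast))
    return 100.0 * np.mean(ratio)

# Toy example: three methods, forecasting horizon of two steps.
toy_forecasts = [[10.0, 20.0], [12.0, 22.0], [100.0, 200.0]]
toy_ranks = [2, 1, 3]
pooled = pool_forecasts(toy_forecasts, toy_ranks, ensemble_size=2)
error = smape([11.0, 21.0], pooled)
```

In the toy example the third method is ranked worst and is left out of the two-member ensemble, so its extreme forecast does not influence the pooled result.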