With large quantities of data now routinely available, forecasting models trained across sets of time series, known as Global Forecasting Models (GFMs), regularly outperform traditional univariate forecasting models that work on isolated series. Because GFMs usually share one set of parameters across all time series, they are often not localised enough to a particular series, especially when datasets are heterogeneous. We study how ensembling techniques can be used with generic GFMs and univariate models to address this issue. Our work systematises and compares relevant current approaches, namely clustering series and training separate submodels per cluster, the so-called ensemble of specialists approach, and building heterogeneous ensembles of global and local models. We fill some gaps in existing GFM localisation approaches, in particular by incorporating varied clustering techniques such as feature-based clustering, distance-based clustering and random clustering, and generalise them to use different underlying GFM types. We then propose a new methodology of clustered ensembles in which we train multiple GFMs on different clusters of series, obtained by varying the number of clusters and the cluster seeds. Using Feed-forward Neural Networks, Recurrent Neural Networks, and Pooled Regression models as the underlying GFMs, in our evaluation on eight publicly available datasets, the proposed models achieve significantly higher accuracy than baseline GFMs and univariate forecasting methods.
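The clustered-ensemble idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes random clustering, a pooled AR(1) regression standing in for the underlying GFM, and a simple average to combine forecasts (the abstract does not state the combination rule). All function names are hypothetical.

```python
import random
from statistics import mean


def fit_pooled_ar1(series_list):
    """Fit y_t = a + b * y_{t-1} by OLS, pooling lag pairs from every series.

    Assumption: a one-lag pooled regression stands in for a generic GFM.
    """
    xs = [s[t - 1] for s in series_list for t in range(1, len(s))]
    ys = [s[t] for s in series_list for t in range(1, len(s))]
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    if var_x == 0:  # constant inputs: fall back to predicting the mean
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var_x
    return y_bar - b * x_bar, b


def random_clusters(n_series, k, seed):
    """Split series indices into k non-empty clusters using a seeded shuffle."""
    if not 1 <= k <= n_series:
        raise ValueError("k must be between 1 and the number of series")
    idx = list(range(n_series))
    random.Random(seed).shuffle(idx)
    return [idx[c::k] for c in range(k)]


def clustered_ensemble_forecast(series_list, cluster_counts, seeds):
    """One-step-ahead forecast per series, averaged over all clusterings.

    Each (number of clusters, seed) pair yields one clustering; one pooled
    model is trained per cluster and forecasts only its member series.
    Assumption: forecasts are combined by an unweighted average.
    """
    n = len(series_list)
    totals = [0.0] * n
    runs = 0
    for k in cluster_counts:
        for seed in seeds:
            for members in random_clusters(n, k, seed):
                a, b = fit_pooled_ar1([series_list[i] for i in members])
                for i in members:
                    totals[i] += a + b * series_list[i][-1]
            runs += 1
    return [total / runs for total in totals]


# Hypothetical toy data: six series with a shared slope and different levels.
toy_series = [[level + 0.5 * t for t in range(10)] for level in range(6)]
forecasts = clustered_ensemble_forecast(toy_series, cluster_counts=[1, 2, 3], seeds=[0, 1])
print(forecasts)
```

With `cluster_counts=[1]` the sketch reduces to a single global model trained on all series. Larger cluster counts give each submodel fewer, potentially more similar series, and averaging over several counts and seeds avoids committing to any one partition.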