Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting over the past few decades, owing to their simplicity and mathematical properties. In the forecasting community, meanwhile, general-purpose tree-based regression algorithms (forests, gradient boosting) have recently become popular due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These connections enable us to use the rich methodology from the TAR literature to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. General-purpose tree-based models do not primarily focus on forecasting and calculate averages at the leaf nodes. In contrast, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves, allowing the models to learn cross-series information, and that uses time-series-specific splitting and stopping procedures. The depth of the tree is controlled by a statistical linearity test commonly employed in TAR models, together with the percentage of error reduction achieved at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest, in which the forecasts from a collection of diverse SETAR-Trees are combined during forecasting. In our evaluation on eight publicly available datasets, the proposed tree and forest models achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.
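To make the node-splitting idea concrete, the following is a minimal sketch of a single split, not the authors' implementation. It pools lag windows from all series, searches for the (lag, threshold) pair that minimises the summed squared error of two child pooled regressions, and accepts the split only if the relative error reduction is large enough. The function names (`embed`, `fit_pr`, `best_split`, `try_split`), the quantile grid of candidate thresholds, the minimum leaf size, and the 5% error-reduction cut-off are all illustrative assumptions. The statistical linearity test mentioned in the abstract is omitted here.

```python
import numpy as np


def embed(series_list, lags):
    """Stack lag windows from every series into one pooled design matrix."""
    X, y = [], []
    for s in series_list:
        s = np.asarray(s, dtype=float)
        for t in range(lags, len(s)):
            X.append(s[t - lags:t][::-1])  # column 0 holds lag 1
            y.append(s[t])
    return np.asarray(X), np.asarray(y)


def fit_pr(X, y):
    """Least-squares pooled regression with intercept; returns (coef, sse)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)


def best_split(X, y, n_thresholds=15, min_leaf=10):
    """Grid-search the (lag, threshold) pair with the lowest child SSE.

    Candidate thresholds are quantiles of each lag column (an assumption).
    Returns (sse, lag_index, threshold), or None if no split is feasible.
    """
    best = None
    for lag in range(X.shape[1]):
        grid = np.quantile(X[:, lag], np.linspace(0.1, 0.9, n_thresholds))
        for thr in grid:
            left = X[:, lag] <= thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = fit_pr(X[left], y[left])[1] + fit_pr(X[~left], y[~left])[1]
            if best is None or sse < best[0]:
                best = (sse, lag, float(thr))
    return best


def try_split(X, y, min_error_reduction=0.05):
    """Accept the best split only if it cuts the parent SSE enough.

    The 5% default is a hypothetical value chosen for illustration.
    """
    parent_sse = fit_pr(X, y)[1]
    candidate = best_split(X, y)
    if candidate is None or parent_sse <= 0:
        return None
    sse, lag, thr = candidate
    reduction = (parent_sse - sse) / parent_sse
    if reduction < min_error_reduction:
        return None
    return {"lag": lag + 1, "threshold": thr, "error_reduction": reduction}


if __name__ == "__main__":
    # Simulate a few two-regime threshold autoregressive series (toy data).
    rng = np.random.default_rng(0)
    series = []
    for _ in range(5):
        s = np.zeros(200)
        for t in range(1, len(s)):
            phi = 0.8 if s[t - 1] <= 0 else -0.5
            s[t] = phi * s[t - 1] + rng.normal(scale=0.5)
        series.append(s)
    X, y = embed(series, lags=2)
    print(try_split(X, y))
```

Applying `try_split` recursively to the two child subsets would grow a tree whose leaves each hold their own pooled regression. A fuller version would also need to run the linearity test before the error-reduction check.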