Time series forecasting is a crucial task in machine learning, with applications that include forecasting electricity consumption, traffic, and air quality. Traditional forecasting models rely on rolling averages, vector autoregression, and autoregressive integrated moving averages (ARIMA). More recently, deep learning and matrix factorization models have been proposed for the same problem and report more competitive performance. However, one major drawback of such models is that they tend to be overly complex in comparison to traditional techniques. In this paper, we ask whether these highly complex deep learning models are really without alternative. We aim to enrich the pool of simple but powerful baselines by revisiting gradient boosting regression trees for time series forecasting. Specifically, we reconfigure the way Gradient Tree Boosting models handle time series data so that it is processed in windows, similar to the deep learning models. For each training window, the target values are concatenated with external features and then flattened to form one input instance for a multi-output gradient boosting regression tree model. We conducted a comparative study on nine datasets against eight state-of-the-art deep learning models presented at top-level conferences in recent years. The results demonstrate that the proposed approach outperforms all of the state-of-the-art models.
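The windowing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_windows`, the `window` and `horizon` parameters, the toy data, and the choice to keep the external features of every time step in the window are all assumptions made for illustration. It only shows how target values and external features might be concatenated and flattened into one input row per window, with the next `horizon` target values as the multi-output label.

```python
from typing import List, Sequence, Tuple


def make_windows(
    targets: Sequence[float],
    covariates: Sequence[Sequence[float]],
    window: int,
    horizon: int,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Turn a series into flattened (input row, multi-step target) pairs.

    Hypothetical helper: names and layout are illustrative assumptions.
    """
    if len(targets) != len(covariates):
        raise ValueError("targets and covariates must have the same length")
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")

    inputs: List[List[float]] = []
    outputs: List[List[float]] = []
    # Slide over every start position that leaves room for a full
    # input window followed by a full forecasting horizon.
    for start in range(len(targets) - window - horizon + 1):
        end = start + window
        row: List[float] = []
        for t in range(start, end):
            # Concatenate the target value with that step's external
            # features, appending directly to the flat row.
            row.append(targets[t])
            row.extend(covariates[t])
        inputs.append(row)
        # The label is the vector of the next `horizon` target values.
        outputs.append(list(targets[end:end + horizon]))
    return inputs, outputs


# Toy data (assumed): 6 time steps, one external feature per step.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [[10.0], [20.0], [30.0], [40.0], [50.0], [60.0]]
X_flat, Y_multi = make_windows(y, x, window=3, horizon=2)
```

Each row of `X_flat` is one flat input instance and each row of `Y_multi` is its multi-step target. One way to fit such pairs would be to wrap a gradient boosting regressor in a multi-output wrapper such as scikit-learn's `MultiOutputRegressor`, although the abstract does not say which library is used.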