Transformer-based methods have shown great potential in long-term time series forecasting. However, most of these methods adopt the standard point-wise self-attention mechanism, which not only becomes intractable for long-term forecasting since its complexity increases quadratically with the length of the time series, but also cannot explicitly capture predictive dependencies from contexts since the corresponding key and value are transformed from the same point. This paper proposes a predictive Transformer-based model called {\em Preformer}. Preformer introduces a novel, efficient {\em Multi-Scale Segment-Correlation} mechanism that divides the time series into segments and utilizes segment-wise correlation-based attention to encode the series. A multi-scale structure is developed to aggregate dependencies at different temporal scales and to facilitate the selection of the segment length. Preformer further designs a predictive paradigm for decoding, in which the key and value come from two successive segments rather than the same segment. In this way, if a key segment has a high correlation score with the query segment, its successive segment contributes more to the prediction of the query segment. Extensive experiments demonstrate that our Preformer outperforms other Transformer-based methods.
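The following is a minimal sketch, not the authors' implementation, of segment-wise correlation attention with the predictive pairing described above: the value for each key segment is taken from the segment that immediately follows it. The function name, the segment length parameter \texttt{seg\_len}, and the scaled dot-product scoring between flattened segments are illustrative assumptions.

\begin{verbatim}
# Sketch of segment-wise correlation attention with a predictive
# key/value pairing (value segment i = key segment i + 1).
# Shapes and scoring are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def segment_correlation_attention(queries, keys, seg_len):
    """queries: (B, Lq, D), keys: (B, Lk, D); Lq, Lk divisible by seg_len."""
    B, Lq, D = queries.shape
    _, Lk, _ = keys.shape
    nq, nk = Lq // seg_len, Lk // seg_len

    q_seg = queries.reshape(B, nq, seg_len, D)   # query segments
    k_seg = keys.reshape(B, nk, seg_len, D)      # key segments

    # Predictive pairing: the value for key segment i is segment i + 1,
    # so the last key segment has no value and is dropped from scoring.
    v_seg = k_seg[:, 1:]                         # (B, nk - 1, seg_len, D)
    k_seg = k_seg[:, :-1]                        # (B, nk - 1, seg_len, D)

    # Correlation score between every query segment and every key segment
    # (dot product over the flattened segment, scaled by its size).
    scores = torch.einsum('bqsd,bksd->bqk', q_seg, k_seg) / (seg_len * D) ** 0.5
    weights = F.softmax(scores, dim=-1)          # (B, nq, nk - 1)

    # Aggregate the *successive* segments according to the correlation weights,
    # so highly correlated key segments let their followers shape the output.
    out = torch.einsum('bqk,bksd->bqsd', weights, v_seg)
    return out.reshape(B, Lq, D)


if __name__ == "__main__":
    q = torch.randn(2, 48, 16)
    k = torch.randn(2, 96, 16)
    print(segment_correlation_attention(q, k, seg_len=12).shape)  # (2, 48, 16)
\end{verbatim}

A multi-scale variant, under the same assumptions, would run this routine with several values of \texttt{seg\_len} and aggregate the resulting outputs, which is how the abstract describes combining dependencies at different temporal scales.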