A variety of real-world applications rely on information about the far future to make decisions, thus calling for efficient and accurate long-sequence multivariate time series forecasting. While recent attention-based forecasting models show strong abilities in capturing long-term dependencies, they still suffer from two key limitations. First, canonical self-attention has quadratic complexity w.r.t. the input time series length, thus falling short in efficiency. Second, different variables' time series often have distinct temporal dynamics, which existing studies fail to capture because they use the same model parameter space, e.g., projection matrices, for all variables' time series, thus falling short in accuracy. To ensure high efficiency and accuracy, we propose Triformer, a triangular, variable-specific attention. (i) Linear complexity: we introduce a novel patch attention with linear complexity. When stacking multiple patch attention layers, we propose a triangular structure in which the layer sizes shrink exponentially, thus maintaining linear complexity. (ii) Variable-specific parameters: we propose a lightweight method that enables distinct sets of model parameters for different variables' time series, enhancing accuracy without compromising efficiency or memory usage. Strong empirical evidence on four datasets from multiple domains justifies our design choices and demonstrates that Triformer outperforms state-of-the-art methods w.r.t. both accuracy and efficiency. This is an extended version of "Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting", to appear in IJCAI 2022 [Cirstea et al., 2022a], including additional experimental results.
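The claim that exponentially shrinking layers keep the stack linear can be checked with simple arithmetic. The sketch below is an illustration under stated assumptions, not the Triformer implementation: it assumes each layer splits its input into non-overlapping patches of a fixed size, emits one output per patch, and does work proportional to its input length. The names `layer_sizes` and `total_work`, and the patch sizes used, are hypothetical.

```python
from typing import List


def layer_sizes(input_len: int, patch_size: int) -> List[int]:
    """Output length of each stacked layer.

    Assumption (not taken from the abstract): every layer emits one
    output per non-overlapping patch, so each layer is patch_size
    times shorter than the one below it.
    """
    if patch_size < 2:
        raise ValueError("patch_size must be at least 2 for layers to shrink")
    sizes = []
    length = input_len
    while length >= patch_size:
        length //= patch_size  # one output per full patch
        sizes.append(length)
    return sizes


def total_work(input_len: int, patch_size: int) -> int:
    """Sum of layer input lengths, a proxy for total cost when each
    layer's cost is linear in its input length (an assumption)."""
    sizes = layer_sizes(input_len, patch_size)
    if not sizes:
        return 0
    # Layer 1 reads the raw input; layer k reads the output of layer k-1.
    layer_inputs = [input_len] + sizes[:-1]
    return sum(layer_inputs)


if __name__ == "__main__":
    H, S = 96, 2
    print(layer_sizes(H, S))   # shrinking layer sizes
    print(total_work(H, S))    # total cost of the triangular stack
    print(H * H)               # cost proxy for full self-attention
```

Under these assumptions the total is a geometric series bounded by H * S / (S - 1), which is at most 2H for any patch size S of at least 2, so the stacked cost stays linear in the input length H, whereas full self-attention scales with H squared.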