The analysis of long sequence data remains challenging in many real-world applications. We propose a novel architecture, ChunkFormer, which extends the existing Transformer framework to address the challenges of modeling long time series. Existing Transformer-based models use an attention mechanism that captures global information across an entire sequence in order to exploit contextual dependencies. In long sequential data, however, local information such as seasonality and short-term fluctuations is confined to short sub-sequences and can be overlooked by global attention. In addition, the original Transformer consumes considerable resources because it computes and stores the full attention matrix during training. To overcome these challenges, ChunkFormer splits a long sequence into smaller chunks for the attention computation and progressively applies different chunk sizes at each stage. In this way, the proposed model gradually learns both local and global information without changing the total length of the input sequence. We have extensively evaluated the effectiveness of this new architecture on data from several business domains and demonstrated its advantage over existing Transformer-based models.
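To make the chunked-attention idea concrete, here is a minimal sketch, assuming full scaled dot-product self-attention is applied independently within non-overlapping chunks and that a sequence of stages uses progressively larger chunk sizes. The function name `chunked_self_attention`, the tensor shapes, and the chunk-size schedule are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def chunked_self_attention(x, chunk_size):
    """Apply self-attention independently within non-overlapping chunks.

    x: tensor of shape (batch, seq_len, d_model); for simplicity this
    sketch assumes seq_len is divisible by chunk_size.
    """
    batch, seq_len, d_model = x.shape
    n_chunks = seq_len // chunk_size
    # Treat each chunk as an independent short sequence.
    chunks = x.reshape(batch * n_chunks, chunk_size, d_model)
    # Attention scores are computed only inside each chunk, so the score
    # matrix is (chunk_size x chunk_size) rather than (seq_len x seq_len).
    out = F.scaled_dot_product_attention(chunks, chunks, chunks)
    return out.reshape(batch, seq_len, d_model)

# Progressive stages: small chunks first capture local patterns such as
# seasonality; larger chunks in later stages move toward global context.
x = torch.randn(8, 96, 64)           # (batch, sequence length, features)
for chunk_size in (8, 24, 96):       # hypothetical chunk-size schedule
    x = chunked_self_attention(x, chunk_size)
```

Because attention is restricted to each chunk, the per-stage cost scales with the chunk size rather than the full sequence length, which is consistent with the memory argument made above; the exact staging and any cross-chunk mixing used by ChunkFormer are not specified in this abstract.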