Recent years have witnessed growing interest in online incremental learning. However, there are three major challenges in this area. The first is concept drift: the probability distribution of the streaming data changes as the data arrives. The second is catastrophic forgetting: forgetting previously learned knowledge when learning new knowledge. The last, which is often overlooked, is the learning of the latent representation, since only a good latent representation can improve the prediction accuracy of the model. Our research builds on these observations and attempts to overcome these difficulties. To this end, we propose Adaptive Online Incremental Learning for evolving data streams (AOIL). We use an auto-encoder with a memory module: on the one hand, it yields the latent features of the input; on the other hand, based on the reconstruction loss of the auto-encoder with the memory module, we can detect the presence of concept drift, trigger the update mechanism, and adjust the model parameters in time. In addition, we divide the features derived from the activations of the hidden layers into two parts, which are used to extract the common and private features respectively. With this approach, the model can learn the private features of newly arriving instances without forgetting what was learned in the past (the shared features), which reduces the occurrence of catastrophic forgetting. At the same time, to obtain the fused feature vector, we use a self-attention mechanism to effectively fuse the extracted features, which further improves the latent representation learning.