We study the problem of fitting a model to a dynamical environment in which new modes of behavior emerge sequentially. The learning model is informed when a new mode emerges, but it does not observe the true mode of individual training sequences. State-of-the-art continual learning approaches cannot handle this setup: parameter transfer suffers from catastrophic interference, and episodic-memory designs require knowledge of the ground-truth mode of each sequence. We devise a novel continual learning method that overcomes both limitations by maintaining a descriptor of the mode of each encountered sequence in a neural episodic memory. We place a Dirichlet Process prior on the attention weights of the memory to foster efficient storage of the mode descriptors. Our method transfers knowledge across tasks by retrieving, for the current sequence, the descriptors of similar modes encountered in past tasks and feeding them into its transition kernel as a control input. We observe the continual learning performance of our method to compare favorably to the mainstream parameter-transfer approach.
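The Dirichlet Process prior mentioned above induces sparsity over the memory's attention weights, so that only a few memory slots receive appreciable mass. A minimal sketch of the standard truncated stick-breaking construction illustrates this effect; the function name, truncation level `K`, and concentration `alpha` are illustrative choices, not details from the paper:

```python
import random


def stick_breaking_weights(alpha, K, seed=0):
    """Truncated stick-breaking construction of Dirichlet Process weights.

    Draws beta_k ~ Beta(1, alpha) and sets
    pi_k = beta_k * prod_{j<k} (1 - beta_j).
    A small alpha concentrates mass on a few sticks, which is the
    kind of sparsity a DP prior encourages on memory attention.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for k in range(K):
        # Close the stick at the last step so the weights sum to one.
        beta = rng.betavariate(1.0, alpha) if k < K - 1 else 1.0
        weights.append(beta * remaining)
        remaining *= 1.0 - beta
    return weights


pi = stick_breaking_weights(alpha=1.0, K=20)
assert abs(sum(pi) - 1.0) < 1e-9  # a valid attention distribution
```

With a small `alpha` most of the probability mass falls on the first few components, so only a handful of stored mode descriptors dominate any retrieval.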