We study the problem of fitting a model to a dynamical environment in which new modes of behavior emerge sequentially. The learning model is aware when a new mode appears, but it does not have access to the true modes of individual training sequences. We devise a novel continual learning method that maintains a descriptor of the mode of an encountered sequence in a neural episodic memory. We place a Dirichlet Process prior on the attention weights of the memory to foster efficient storage of the mode descriptors. Our method performs continual learning by transferring knowledge across tasks: it retrieves the descriptors of past modes most similar to the mode of the current sequence and feeds the retrieved descriptor into its transition kernel as a control input. We observe that the continual learning performance of our method compares favorably to the mainstream parameter-transfer approach.
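To make the retrieval-and-conditioning step concrete, the following is a minimal sketch, not the paper's implementation: it assumes dot-product attention over a small memory of mode descriptors and a toy linear transition kernel, and it omits the Dirichlet Process prior on the attention weights as well as the learned sequence encoder. All dimensions and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
state_dim, descriptor_dim, memory_slots = 4, 8, 5

# Neural episodic memory: one stored descriptor per previously seen mode.
memory = rng.normal(size=(memory_slots, descriptor_dim))

def retrieve_descriptor(query, memory, temperature=1.0):
    """Attend over stored mode descriptors and return their weighted average."""
    scores = memory @ query / temperature        # similarity of the query to each slot
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax attention weights
    return weights @ memory, weights

# Toy linear transition kernel that takes the retrieved descriptor as control input.
A = rng.normal(scale=0.1, size=(state_dim, state_dim))
B = rng.normal(scale=0.1, size=(state_dim, descriptor_dim))

def transition(state, descriptor):
    """Predict the next state conditioned on the retrieved mode descriptor."""
    return A @ state + B @ descriptor

# A random stand-in for the encoding of the current sequence into a query.
query = rng.normal(size=descriptor_dim)
descriptor, attention = retrieve_descriptor(query, memory)

state = rng.normal(size=state_dim)
next_state = transition(state, descriptor)
print("attention over memory slots:", np.round(attention, 3))
print("predicted next state:", np.round(next_state, 3))
```

In this reading, knowledge transfer happens through the memory contents rather than through the transition parameters: descriptors of similar past modes shape the prediction via the control input, while the kernel itself stays shared across tasks.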