A key weakness of machine learning algorithms is that models struggle to solve new problems without forgetting previously acquired knowledge. The Continual Learning (CL) paradigm has emerged as a protocol for systematically investigating settings in which a model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We discuss this principle from a Bayesian perspective and show its connections to previous approaches to CL. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network, governed by a gating policy. Because the formulation is based on generic utility functions, the optimality principle applies to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method in continual supervised learning and in continual reinforcement learning.
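To make the gating mechanism concrete, the following is a minimal sketch (not the paper's implementation) of a mixture-of-experts layer: several expert sub-layers form parallel information processing paths, and a softmax gating policy produces per-sample weights that combine their outputs. All names (`MixtureLayer`, `softmax`) and the choice of linear experts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax used by the gating policy
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MixtureLayer:
    """Illustrative mixture-of-experts layer with a learned gating policy."""

    def __init__(self, d_in, d_out, n_experts):
        # each expert is an independent linear path through the layer
        self.experts = [rng.normal(0.0, 0.1, (d_in, d_out)) for _ in range(n_experts)]
        # gating parameters: map input to a distribution over experts
        self.gate = rng.normal(0.0, 0.1, (d_in, n_experts))

    def forward(self, x):
        g = softmax(x @ self.gate)                       # (batch, n_experts)
        outs = np.stack([x @ W for W in self.experts])   # (n_experts, batch, d_out)
        # gate-weighted combination of the expert paths
        return np.einsum("be,ebd->bd", g, outs)

layer = MixtureLayer(d_in=4, d_out=3, n_experts=2)
y = layer.forward(rng.normal(size=(5, 4)))
print(y.shape)  # (5, 3)
```

In a continual learning setting, the intuition is that the gate can route samples from different tasks along different expert paths, so that updating one path interferes less with knowledge stored in the others.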