Recent research efforts in lifelong learning propose to grow a mixture of models to adapt to an increasing number of tasks. This methodology shows promising results in overcoming catastrophic forgetting. However, the theory behind these successful models is still not well understood. In this paper, we perform a theoretical analysis for lifelong learning models by deriving risk bounds based on the discrepancy distance between the probabilistic representation of data generated by the model and that corresponding to the target dataset. Inspired by this analysis, we introduce a new lifelong learning approach, the Lifelong Infinite Mixture (LIMix) model, which can automatically expand its network architecture or choose an appropriate component to adapt its parameters for learning a new task, while preserving previously learnt information. We propose to incorporate knowledge by means of Dirichlet processes, using a gating mechanism that computes the dependence between the knowledge learnt previously and stored in each component, and a new set of data. In addition, we train a compact Student model that accumulates cross-domain representations over time and makes quick inferences. The code is available at https://github.com/dtuzi123/Lifelong-infinite-mixture-model.