Lifelong learning aims to accumulate knowledge and alleviate catastrophic forgetting when learning tasks sequentially. However, existing lifelong language learning methods focus only on the supervised learning setting. Unlabeled data, which can be easily accessed in real-world scenarios, remain underexplored. In this paper, we explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data. We propose an unlabeled-data-enhanced lifelong learner to explore SSLL. Specifically, we dedicate task-specific modules to alleviate catastrophic forgetting and design two modules to exploit unlabeled data: (1) a virtual supervision enhanced task solver is constructed on a teacher-student framework to mine the underlying knowledge from unlabeled data; and (2) a backward augmented learner is built to encourage knowledge transfer from newly arrived unlabeled data to previous tasks. Experimental results on various language tasks demonstrate our model's effectiveness and superiority over competitive baselines under the new SSLL setting.
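The abstract does not specify how the teacher-student framework derives virtual supervision from unlabeled data. As an illustration only, the following is a minimal sketch of a generic pseudo-labeling loop with an EMA teacher, a common instantiation of such frameworks; all names (`ema_update`, `train_step`, `threshold`, `unsup_weight`) and design choices here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch: mining "virtual supervision" from unlabeled data
# with a teacher-student setup (generic pseudo-labeling + EMA teacher;
# not the paper's actual method).
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, decay=0.999):
    # Teacher weights track the student via an exponential moving average.
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1.0 - decay)

def train_step(student, teacher, optimizer, labeled, unlabeled,
               threshold=0.9, unsup_weight=1.0):
    x_l, y_l = labeled   # labeled batch: inputs and gold labels
    x_u = unlabeled      # unlabeled batch: inputs only

    # Standard supervised loss on the labeled data.
    sup_loss = F.cross_entropy(student(x_l), y_l)

    # Teacher predicts on unlabeled data; only confident predictions
    # are kept as pseudo-labels ("virtual supervision").
    with torch.no_grad():
        probs = F.softmax(teacher(x_u), dim=-1)
        conf, pseudo = probs.max(dim=-1)
        mask = conf >= threshold

    unsup_loss = torch.tensor(0.0)
    if mask.any():
        unsup_loss = F.cross_entropy(student(x_u[mask]), pseudo[mask])

    loss = sup_loss + unsup_weight * unsup_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)
    return loss.item()

# Toy usage with a linear classifier on random data.
student = torch.nn.Linear(16, 4)
teacher = copy.deepcopy(student)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
labeled = (torch.randn(8, 16), torch.randint(0, 4, (8,)))
unlabeled = torch.randn(32, 16)
print(train_step(student, teacher, opt, labeled, unlabeled))
```

In this sketch, the confidence threshold keeps noisy teacher predictions from corrupting the student, while the EMA teacher stays more stable than the student across updates; the paper's actual solver may weight or filter unlabeled examples differently.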