Domain classification is the fundamental task in natural language understanding (NLU) and often requires fast adaptation to newly emerging domains. This constraint makes it impractical to retrain on all previous domains, even when the old data remain accessible to the new model. Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data differ significantly. In fact, the key real-world problem is not the absence of old data, but the inefficiency of retraining the model on the entire old dataset. Is it possible to utilize some old data to achieve high accuracy and maintain stable performance without introducing extra hyperparameters? In this paper, we propose a hyperparameter-free continual learning model for text data that stably produces high performance under various environments. Specifically, we utilize Fisher information to select exemplars that can "record" key information of the original model. We also propose a novel scheme called dynamical weight consolidation to enable hyperparameter-free learning during the retraining process. Extensive experiments demonstrate that the baselines suffer from fluctuating performance and are therefore impractical to use. In contrast, our proposed model CCFI significantly and consistently outperforms the best state-of-the-art method by up to 20% in average accuracy, and each component of CCFI contributes effectively to the overall performance.
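To make the exemplar-selection idea concrete, the following is a minimal sketch of Fisher-information-based exemplar selection under our own assumptions (the abstract does not specify the exact criterion): a per-sample diagonal Fisher vector is estimated from squared gradients of the log-likelihood, and the samples whose Fisher vectors best match the dataset-level Fisher estimate are kept as exemplars. The function and variable names here (per_sample_fisher_scores, select_exemplars, budget) are illustrative, not from the paper.

import torch
import torch.nn.functional as F

def per_sample_fisher_scores(model, data_loader, device="cpu"):
    """Estimate a diagonal Fisher vector for each sample as the squared gradient
    of the log-likelihood w.r.t. the trained parameters.
    Assumes data_loader yields one (x, y) example per batch."""
    scores = []
    model.eval()
    for x, y in data_loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        # Negative log-likelihood of the true label; the sign is irrelevant
        # once the gradient is squared.
        nll = F.cross_entropy(model(x), y)
        nll.backward()
        fisher_vec = torch.cat([p.grad.detach().pow(2).flatten()
                                for p in model.parameters() if p.grad is not None])
        scores.append(fisher_vec)
    return torch.stack(scores)

def select_exemplars(fisher_per_sample, budget):
    """Keep the samples whose Fisher vectors are closest (in L2 distance) to the
    dataset-level Fisher estimate, so the exemplar set 'records' the old model's
    key information. 'budget' is the number of exemplars to retain."""
    dataset_fisher = fisher_per_sample.mean(dim=0)
    dist = torch.linalg.norm(fisher_per_sample - dataset_fisher, dim=1)
    return torch.argsort(dist)[:budget]

In practice one would run per_sample_fisher_scores over the old domain's data with the frozen old model, then call select_exemplars to choose the small exemplar set that is replayed while learning the new domain; this is only one plausible realization of the selection rule named in the abstract.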