Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available. While prior research on continual learning in automatic speech recognition has focused on adapting models across multiple distinct speech recognition tasks, in this paper we propose an experimental setting for \textit{online continual learning} in automatic speech recognition on a single task. Focusing specifically on the case where additional training data for the same task becomes available incrementally over time, we demonstrate the effectiveness of performing incremental model updates to end-to-end speech recognition models with an online Gradient Episodic Memory (GEM) method. Moreover, we show that with online continual learning and a selective sampling strategy, we can maintain accuracy similar to retraining a model from scratch at a significantly lower computational cost. We also verify our method with self-supervised learning (SSL) features.
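To make the online GEM update concrete, the sketch below shows the single-constraint form of the gradient projection (the closed-form simplification popularized by A-GEM), applied once per incoming batch against a gradient computed on a reservoir-sampled episodic memory. This is a minimal sketch under stated assumptions, not the paper's implementation: \texttt{loss\_fn}, the batch format, the memory capacity, and the use of reservoir sampling as the selective-sampling strategy are all placeholders for illustration.

\begin{verbatim}
import random
import torch
import torch.nn as nn

def flat_grad(model: nn.Module) -> torch.Tensor:
    # Flatten all parameter gradients into a single vector.
    return torch.cat([p.grad.reshape(-1) for p in model.parameters()
                      if p.grad is not None])

def assign_grad(model: nn.Module, flat: torch.Tensor) -> None:
    # Write a flat gradient vector back into the model's .grad fields.
    offset = 0
    for p in model.parameters():
        if p.grad is None:
            continue
        n = p.grad.numel()
        p.grad.copy_(flat[offset:offset + n].view_as(p.grad))
        offset += n

def reservoir_update(memory, capacity, seen, sample):
    # Classic reservoir sampling (an assumed selective-sampling
    # strategy): every utterance seen so far has an equal chance of
    # residing in the episodic memory.
    seen += 1
    if len(memory) < capacity:
        memory.append(sample)
    else:
        j = random.randrange(seen)
        if j < capacity:
            memory[j] = sample
    return seen

def online_gem_step(model, loss_fn, optimizer, new_batch, memory_batch):
    # Gradient on episodic-memory samples (past data).
    optimizer.zero_grad()
    loss_fn(model, memory_batch).backward()
    g_mem = flat_grad(model).clone()

    # Gradient on the newly arrived data.
    optimizer.zero_grad()
    loss_fn(model, new_batch).backward()
    g_new = flat_grad(model)

    # Single-constraint GEM projection: if the new gradient would
    # increase the loss on memory (negative inner product), remove
    # the conflicting component before taking the optimizer step.
    dot = torch.dot(g_new, g_mem)
    if dot < 0:
        g_new -= (dot / torch.dot(g_mem, g_mem)) * g_mem
        assign_grad(model, g_new)

    optimizer.step()
\end{verbatim}

Full GEM solves a quadratic program with one inequality constraint per past task; in the single-task online setting considered here, with one averaged memory gradient, that projection reduces to the closed-form step above.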