In this paper, we propose a method for incrementally learning two distinct tasks over time: acoustic scene classification (ASC) and audio tagging (AT). We use a simple convolutional neural network (CNN) as the incremental learner. Incremental learners generally suffer from catastrophic forgetting: when trained sequentially on a new task, they lose performance on the previous one. To alleviate this problem, we use independent learning and knowledge distillation (KD) between the time steps of learning. Experiments are performed on the TUT 2016/2017 dataset, which contains 4 acoustic scene classes and 25 sound event classes. At the incremental time step, the proposed learner solves the AT task with an F1 score of 54.4% and the ASC task with an accuracy of 88.9%, outperforming a multi-task system that solves ASC and AT simultaneously. ASC accuracy degrades by only 5.1 percentage points from its initial value of 94.0%.
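To make the role of knowledge distillation between time steps concrete, the following is a minimal sketch of one common formulation, not necessarily the loss used in this work. It assumes the model from the initial time step acts as a frozen teacher for the ASC output, while the AT output is trained with binary cross-entropy on multi-label targets. All function names, the temperature, and the weighting factor `kd_weight` are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Assumption: the teacher is the model from the previous time step,
    and the logits are its ASC outputs on the current training input.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def tagging_loss(targets, probs, eps=1e-7):
    """Mean binary cross-entropy for multi-label audio tags."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


def incremental_loss(teacher_asc_logits, student_asc_logits,
                     at_targets, at_probs, kd_weight=1.0, temperature=2.0):
    """New-task (AT) loss plus a weighted KD term that preserves ASC behaviour."""
    kd = distillation_loss(teacher_asc_logits, student_asc_logits, temperature)
    return tagging_loss(at_targets, at_probs) + kd_weight * kd


# Hypothetical values: 4 scene logits and 3 of the tag outputs.
loss = incremental_loss(
    teacher_asc_logits=[2.0, 0.5, -1.0, 0.1],
    student_asc_logits=[1.5, 0.7, -0.8, 0.0],
    at_targets=[1, 0, 1],
    at_probs=[0.8, 0.2, 0.6],
)
print(f"combined loss: {loss:.4f}")
```

The KD term is smallest when the student reproduces the teacher's softened ASC distribution, so minimizing the combined loss trades off learning the new AT task against drifting away from the previously learned ASC behaviour.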