The ability to learn continually without forgetting past tasks is a desired attribute for artificial learning systems. Existing approaches to enable such learning in artificial neural networks usually rely on network growth, importance-based weight updates, or replay of old data from memory. In contrast, we propose a novel approach in which a neural network learns new tasks by taking gradient steps orthogonal to the gradient subspaces deemed important for the past tasks. We find the bases of these subspaces by analyzing network representations (activations) after learning each task with Singular Value Decomposition (SVD) in a single-shot manner and store them in memory as Gradient Projection Memory (GPM). With qualitative and quantitative analyses, we show that such orthogonal gradient descent induces minimal to no interference with the past tasks, thereby mitigating forgetting. We evaluate our algorithm on diverse image classification datasets with short and long sequences of tasks and report better or on-par performance compared to state-of-the-art approaches.
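To make the two core operations described above concrete, the following is a minimal NumPy sketch of (i) extracting a subspace basis from a layer's activations with SVD and (ii) stepping orthogonal to that stored subspace. The function names (compute_gpm_basis, project_orthogonal) and the energy threshold parameter are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def compute_gpm_basis(activations, threshold=0.99):
    """Sketch: orthonormal basis for the significant activation subspace of one layer.

    activations: (features, samples) representation matrix collected after learning a task.
    threshold: hypothetical fraction of spectral energy to retain when truncating the SVD.
    """
    U, S, _ = np.linalg.svd(activations, full_matrices=False)
    energy = np.cumsum(S**2) / np.sum(S**2)
    k = int(np.searchsorted(energy, threshold)) + 1  # smallest rank capturing `threshold` energy
    return U[:, :k]  # columns form the stored basis (the GPM for this layer)

def project_orthogonal(grad, basis):
    """Remove the component of `grad` lying in the stored subspace,
    so the update direction does not interfere with past tasks."""
    if basis is None:  # no past tasks yet: use the raw gradient
        return grad
    return grad - basis @ (basis.T @ grad)
```

In this sketch, basis holds orthonormal columns, so basis @ (basis.T @ grad) is the component of the gradient inside the memorized subspace, and subtracting it yields the orthogonal step used for learning the new task.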