The Backprop algorithm for learning in neural networks utilizes two mechanisms: first, stochastic gradient descent and, second, initialization with small random weights, where the latter is essential to the effectiveness of the former. We show that in continual learning setups, Backprop performs well initially, but over time its performance degrades. Stochastic gradient descent alone is insufficient to learn continually; the initial randomness enables only initial learning, not continual learning. To the best of our knowledge, ours is the first result showing this degradation in Backprop's ability to learn. To address this issue, we propose an algorithm that continually injects random features alongside gradient descent, using a new generate-and-test process. We call this the Continual Backprop algorithm. We show that, unlike Backprop, Continual Backprop is able to adapt continually in both supervised and reinforcement learning problems. We expect that as continual learning becomes more common in future applications, methods like Continual Backprop, in which the advantages of random initialization are present throughout learning, will be essential.
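To make the generate-and-test idea concrete, the following is a minimal, hypothetical sketch in NumPy: alongside ordinary stochastic gradient descent, a small fraction of the hidden units judged least useful is periodically replaced with freshly initialized random features, so that the benefits of random initialization persist throughout learning. The network, the utility measure, the replacement rate, and the toy non-stationary regression stream are all illustrative assumptions, not the exact procedure of the Continual Backprop algorithm.

```python
# Illustrative sketch only: SGD plus a periodic generate-and-test step that
# replaces the least useful hidden units with new random features.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 10, 50, 1
W1 = rng.normal(0, 0.1, (n_hidden, n_in))   # small random init, input -> hidden
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_out, n_hidden))  # hidden -> output
b2 = np.zeros(n_out)

lr = 0.01            # SGD step size
replace_every = 100  # how often to run the generate-and-test step
replace_frac = 0.02  # fraction of hidden units replaced per test (assumed value)

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)        # ReLU hidden features
    y = W2 @ h + b2
    return h, y

def sgd_step(x, target):
    """One step of ordinary stochastic gradient descent on squared error."""
    global W1, b1, W2, b2
    h, y = forward(x)
    err = y - target
    gW2, gb2 = np.outer(err, h), err
    dh = (W2.T @ err) * (h > 0)             # backprop through the ReLU
    gW1, gb1 = np.outer(dh, x), dh
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
    return h

def generate_and_test(mean_abs_activation):
    """Replace the lowest-utility hidden units with fresh random features."""
    global W1, b1, W2
    # Assumed utility measure: |outgoing weight| x running mean of |activation|.
    utility = np.abs(W2[0]) * mean_abs_activation
    k = max(1, int(replace_frac * n_hidden))
    worst = np.argsort(utility)[:k]
    W1[worst] = rng.normal(0, 0.1, (k, n_in))  # new random input weights
    b1[worst] = 0.0
    W2[:, worst] = 0.0                         # new features start with no output influence

# Toy non-stationary regression stream: the target function changes every 2000 steps.
w_true = rng.normal(size=n_in)
act_trace = np.zeros(n_hidden)
for t in range(1, 10001):
    if t % 2000 == 0:
        w_true = rng.normal(size=n_in)       # task change
    x = rng.normal(size=n_in)
    target = np.array([w_true @ x])
    h = sgd_step(x, target)
    act_trace = 0.99 * act_trace + 0.01 * np.abs(h)  # running mean of |activation|
    if t % replace_every == 0:
        generate_and_test(act_trace)
```

In this sketch the replaced units get zeroed outgoing weights so that a new random feature does not disturb the current predictions until gradient descent finds a use for it; this particular design choice is an assumption made for the example.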