The Backprop algorithm for learning in neural networks relies on two mechanisms: stochastic gradient descent and initialization with small random weights, where the latter is essential to the effectiveness of the former. We show that in continual learning setups Backprop performs well initially, but its performance degrades over time. Stochastic gradient descent alone is insufficient to learn continually; the initial randomness enables only initial learning, not continual learning. To the best of our knowledge, ours is the first result showing this degradation in Backprop's ability to learn. To address this degradation in Backprop's plasticity, we propose an algorithm that continually injects random features alongside gradient descent using a new generate-and-test process. We call this the \textit{Continual Backprop} algorithm. We show that, unlike Backprop, Continual Backprop is able to adapt continually in both supervised and reinforcement learning (RL) problems. Continual Backprop has the same computational complexity as Backprop and can be seen as a natural extension of Backprop for continual learning.
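The core idea of Continual Backprop is to pair ordinary stochastic gradient descent with a generate-and-test step that replaces a small fraction of low-utility hidden units with freshly initialized small random weights. The sketch below illustrates this pairing on a one-hidden-layer regression network under a drifting toy target; the utility proxy, replacement rate, and initialization scale are illustrative assumptions, not the paper's exact specification.

\begin{verbatim}
import numpy as np

# Minimal sketch: SGD plus a generate-and-test step that keeps injecting
# random features. Utility measure and replacement schedule are placeholders.
rng = np.random.default_rng(0)
n_in, n_hidden, step_size = 10, 50, 0.01
replace_rate = 0.001           # expected fraction of hidden units replaced per step
W1 = rng.normal(0.0, 0.1, size=(n_hidden, n_in))   # small random input weights
b1 = np.zeros(n_hidden)
w2 = rng.normal(0.0, 0.1, size=n_hidden)           # output weights
utility = np.zeros(n_hidden)                       # running utility per hidden unit

def sgd_step(x, y):
    """One stochastic gradient descent step on squared error."""
    global W1, b1, w2, utility
    h = np.tanh(W1 @ x + b1)                 # hidden features
    err = w2 @ h - y
    grad_w2 = err * h                        # gradients of 0.5 * err**2
    grad_pre = err * w2 * (1.0 - h ** 2)     # gradient w.r.t. hidden pre-activations
    W1 -= step_size * np.outer(grad_pre, x)
    b1 -= step_size * grad_pre
    w2 -= step_size * grad_w2
    # Running proxy for each unit's contribution to the output (assumed measure).
    utility = 0.99 * utility + 0.01 * np.abs(w2 * h)
    return 0.5 * err ** 2

def generate_and_test():
    """Replace the lowest-utility hidden units with fresh random features."""
    global W1, b1, w2, utility
    n_replace = rng.binomial(n_hidden, replace_rate)
    if n_replace == 0:
        return
    worst = np.argsort(utility)[:n_replace]
    W1[worst] = rng.normal(0.0, 0.1, size=(n_replace, n_in))  # new random feature
    b1[worst] = 0.0
    w2[worst] = 0.0              # new unit starts with no effect on the output
    utility[worst] = 0.0

# Toy continual-learning loop: the target function changes abruptly over time.
target_w = rng.normal(size=n_in)
for t in range(20000):
    if t % 5000 == 0:
        target_w = rng.normal(size=n_in)     # task change
    x = rng.normal(size=n_in)
    y = np.tanh(target_w @ x)
    sgd_step(x, y)
    generate_and_test()
\end{verbatim}

Because each step replaces only an expected \texttt{replace\_rate} fraction of units and the test uses a running statistic already computed during the forward pass, the per-step cost stays within a constant factor of plain Backprop.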