While deep neural networks have surpassed human performance in multiple situations, they are prone to catastrophic forgetting: upon training a new task, they rapidly forget previously learned ones. Neuroscience studies, based on idealized tasks, suggest that in the brain, synapses overcome this issue by adjusting their plasticity depending on their past history. However, such "metaplastic" behaviours do not transfer directly to mitigate catastrophic forgetting in deep neural networks. In this work, we interpret the hidden weights used by binarized neural networks, a low-precision version of deep neural networks, as metaplastic variables, and modify their training technique to alleviate forgetting. Building on this idea, we propose and demonstrate experimentally, in situations of multitask and stream learning, a training technique that reduces catastrophic forgetting without requiring previously presented data or formal boundaries between datasets, and with performance approaching more mainstream techniques that do rely on task boundaries. We support our approach with a theoretical analysis on a tractable task. This work bridges computational neuroscience and deep learning, and presents significant assets for future embedded and neuromorphic systems, especially when using novel nanodevices featuring physics analogous to metaplasticity.
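To make the core idea concrete, the sketch below illustrates, under stated assumptions, how the hidden (real-valued) weights of a binarized layer can act as metaplastic variables: updates that reinforce a weight's current sign are applied normally, while updates that would push it back toward a sign flip are attenuated more strongly as the hidden weight grows in magnitude. The specific attenuation function (1 - tanh^2(m·w)), the constant m, and the function name metaplastic_update are illustrative assumptions, not the exact rule used in the paper.

```python
# Minimal sketch (assumed form, not the authors' exact rule) of a metaplastic
# update for the hidden weights of a binarized layer. The forward pass would use
# sign(w_hidden); here we only show how the hidden weights are consolidated.
import numpy as np

def metaplastic_update(w_hidden, grad, lr=0.01, m=1.0):
    """Return updated hidden weights given a gradient w.r.t. the binary weights."""
    update = -lr * grad
    # Updates aligned with the current sign strengthen the weight: apply them as-is.
    strengthening = np.sign(update) == np.sign(w_hidden)
    # Updates opposing the current sign are damped by a factor that shrinks with
    # |w_hidden|, so weights consolidated by past tasks resist being flipped.
    damping = 1.0 - np.tanh(m * np.abs(w_hidden)) ** 2
    return w_hidden + np.where(strengthening, update, damping * update)

# Toy usage: under the same opposing gradient, a strongly consolidated weight
# barely moves, while a weakly consolidated one changes readily.
w = np.array([2.0, 0.1])
g = np.array([1.0, 1.0])
print(metaplastic_update(w, g, lr=0.5))
```

The design intent in this sketch is that the magnitude of the hidden weight plays the role of a metaplastic state: it records how consistently a binary weight has been reinforced, and gates how easily new tasks can overwrite it.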