Biological neural networks are capable of recruiting different sets of neurons to encode different memories. However, when training artificial neural networks on a set of tasks, typically no mechanism is employed for selectively producing anything analogous to these neuronal ensembles. Further, artificial neural networks suffer from catastrophic forgetting, where performance on earlier tasks rapidly deteriorates as new tasks are learned sequentially. By contrast, sequential learning is possible for a range of biological organisms. We introduce Learned Context Dependent Gating (LXDG), a method to flexibly allocate and recall `artificial neuronal ensembles', using a particular network structure and a new set of regularization terms. Activities in the hidden layers of the network are modulated by gates, which are dynamically produced during training. The gates are themselves the outputs of networks trained with a sigmoid output activation. The regularization terms we introduce correspond to properties exhibited by biological neuronal ensembles. The first term penalizes low gate sparsity, ensuring that only a specified fraction of the network is used. The second term ensures that previously learned gates are recalled when the network is presented with input from previously learned tasks. Finally, the third term ensures that new tasks are encoded in gates that are as orthogonal as possible to previously used ones. We demonstrate the ability of this method to alleviate catastrophic forgetting on continual learning benchmarks. When the new regularization terms are included in the model along with Elastic Weight Consolidation (EWC), it achieves better performance on the `permuted MNIST' benchmark than with EWC alone. The `rotated MNIST' benchmark demonstrates how similar tasks recruit similar neurons into the artificial neuronal ensemble.
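To make the three regularization terms concrete, the sketch below illustrates one plausible PyTorch-style reading of the abstract: a gating network with a sigmoid output, a sparsity penalty toward a target active fraction, a recall penalty matching stored gates for old-task inputs, and an orthogonality penalty against previously allocated gates. All names, shapes, and functional forms here are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch of the LXDG ingredients described in the abstract.
# Everything below (names, target_fraction, loss forms) is an assumption.

class GatingNetwork(nn.Module):
    """Produces a gate vector in (0, 1) that modulates one hidden layer."""
    def __init__(self, context_dim, hidden_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, hidden_dim),
            nn.Sigmoid(),  # sigmoid output activation, as stated in the abstract
        )

    def forward(self, context):
        return self.net(context)

def sparsity_penalty(gate, target_fraction=0.2):
    # Penalize low sparsity: push the mean gate activity toward a
    # specified fraction of active units.
    return (gate.mean() - target_fraction) ** 2

def recall_penalty(gate, stored_gate):
    # Encourage the gate produced for an old task's input to match the
    # gate that was learned when that task was first trained.
    return ((gate - stored_gate) ** 2).mean()

def orthogonality_penalty(gate, old_gates):
    # Encourage the gate for a new task to overlap as little as possible
    # with gates already allocated to previous tasks.
    if not old_gates:
        return gate.new_zeros(())
    overlaps = [(gate * g).sum() / (gate.norm() * g.norm() + 1e-8) for g in old_gates]
    return torch.stack(overlaps).mean()
```

In this reading, the hidden-layer activations would be multiplied elementwise by the gate vector, and the three penalties would be added (with suitable weights) to the task loss, alongside the EWC penalty when the two methods are combined.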