Short-term memory in standard, general-purpose, sequence-processing recurrent neural networks (RNNs) is stored as activations of nodes or "neurons." Generalising feedforward NNs to such RNNs is mathematically straightforward and natural, and even historical: already in 1943, McCulloch and Pitts proposed this as a surrogate for "synaptic modifications" (in effect, generalising the Lenz-Ising model, the first non-sequence-processing RNN architecture of the 1920s). A lesser-known alternative approach to storing short-term memory in "synaptic connections" -- by parameterising and controlling the dynamics of a context-sensitive, time-varying weight matrix through another NN -- yields another "natural" type of short-term memory in sequence-processing NNs: the Fast Weight Programmers (FWPs) of the early 1990s. FWPs have seen a recent revival as generic sequence processors, achieving competitive performance across various tasks. They are formally closely related to the now popular Transformers. Here we present them in the context of artificial NNs as an abstraction of biological NNs -- a perspective that has not been stressed enough in previous FWP work. We first review aspects of FWPs for pedagogical purposes, then discuss connections to related works motivated by insights from neuroscience.
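As a minimal sketch of the mechanism described above, the snippet below implements the purely additive fast weight update rule of the early-1990s FWPs, the variant formally equivalent to unnormalised linear attention in Transformers. All dimensions, variable names, and the random "slow" weights are illustrative assumptions, not the specific configuration of any experiment in this work.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_key, d_val, T = 8, 16, 16, 5

# Hypothetical slow (trained) weights of the "programmer" NN: from each
# input x_t it generates a key, a value, and a query that write to and
# read from the fast weight matrix.
W_k = rng.normal(scale=0.1, size=(d_key, d_in))
W_v = rng.normal(scale=0.1, size=(d_val, d_in))
W_q = rng.normal(scale=0.1, size=(d_key, d_in))

# Fast weights: the short-term memory stored "in the connections,"
# rewritten at every time step (initially empty).
W_fast = np.zeros((d_val, d_key))

for t in range(T):
    x = rng.normal(size=d_in)      # input at step t
    k = W_k @ x                    # key: where to write
    v = W_v @ x                    # value: what to store
    q = W_q @ x                    # query: what to retrieve
    W_fast += np.outer(v, k)       # additive, Hebbian-style rank-1 update
    y = W_fast @ q                 # read-out through the fast weights
    print(t, y[:3])
```

Note the contrast with a standard RNN: here the slow weights (the programmer) are the only learned parameters, while the context of the sequence is held in `W_fast`, a weight matrix rather than a vector of activations.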