Long short-term memory (LSTM) networks and their variants are capable of encapsulating long-range dependencies, which is evident from their performance on a variety of linguistic tasks. On the other hand, simple recurrent networks (SRNs), which appear more biologically grounded in terms of synaptic connections, have generally been less successful at capturing long-range dependencies as well as the loci of grammatical errors in an unsupervised setting. In this paper, we seek to develop models that bridge the gap between biological plausibility and linguistic competence. We propose a new architecture, the Decay RNN, which incorporates the decaying nature of neuronal activations and models the excitatory and inhibitory connections in a population of neurons. Besides its biological inspiration, our model also shows competitive performance relative to LSTMs on subject-verb agreement, sentence grammaticality, and language modeling tasks. These results provide some pointers towards probing the nature of the inductive biases required for RNN architectures to model linguistic phenomena successfully.
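The abstract names the key ingredients of the proposed architecture (a decaying hidden activation and excitatory/inhibitory recurrent connections) without giving the update rule. As a rough illustration only, a decay-gated recurrence might be sketched as follows; the class and parameter names (DecayRNNCell, alpha), the exact update equation, and the choice of nonlinearity are assumptions for illustration, not taken from the paper, and the excitatory/inhibitory sign constraints are omitted.

import torch
import torch.nn as nn

class DecayRNNCell(nn.Module):
    """Illustrative decay-gated RNN cell: the hidden state is a convex
    combination of its decayed previous value and a fresh nonlinear update.
    (Hypothetical sketch; not the paper's exact formulation.)"""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.in2hid = nn.Linear(input_size, hidden_size)
        self.hid2hid = nn.Linear(hidden_size, hidden_size, bias=False)
        # Learnable decay parameter; a sigmoid keeps it in (0, 1).
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x_t, h_prev):
        alpha = torch.sigmoid(self.alpha)
        update = torch.relu(self.in2hid(x_t) + self.hid2hid(h_prev))
        # Decayed previous activation plus the new input-driven update.
        return alpha * h_prev + (1.0 - alpha) * update

# Toy usage: run the cell over a short sequence of random "word embeddings".
cell = DecayRNNCell(input_size=8, hidden_size=16)
h = torch.zeros(1, 16)
for x_t in torch.randn(5, 1, 8):
    h = cell(x_t, h)
print(h.shape)  # torch.Size([1, 16])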