Previous work has established that RNNs with an unbounded activation function have the capacity to count exactly. However, it has also been shown that RNNs are difficult to train effectively and generally do not learn exact counting behaviour. In this paper, we address this problem by studying the simplest possible RNN, a linear single-cell network. We conduct a theoretical analysis of linear RNNs, identify conditions under which the models exhibit exact counting behaviour, and provide a formal proof that these conditions are necessary and sufficient. We also conduct an empirical analysis using tasks involving a Dyck-1-like Balanced Bracket language under two different settings. We observe that linear RNNs trained with the standard approach generally do not meet the necessary and sufficient conditions for counting behaviour. We investigate how varying the length of the training sequences and using different target classes affect model behaviour during training, as well as the ability of linear RNN models to approximate the indicator conditions.
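To make the setting concrete, the following is a minimal sketch of a linear single-cell RNN reading a bracket string. The update rule `h_t = w_h * h_{t-1} + w_x(x_t) + b`, the function names, and the hand-picked weights (`w_h = 1`, `+1` for an opening bracket, `-1` for a closing bracket, `b = 0`) are illustrative assumptions. They show one weight setting that counts exactly and are not a restatement of the conditions derived in the paper.

```python
def linear_rnn_states(seq, w_h=1.0, w_open=1.0, w_close=-1.0, b=0.0):
    """Run a single-cell linear RNN over a bracket string.

    Assumed update (illustrative, no activation function):
        h_t = w_h * h_{t-1} + w_x(x_t) + b
    where w_x(x_t) is w_open for '(' and w_close for ')'.
    Returns the list of hidden states, one per input symbol.
    """
    h = 0.0
    states = []
    for ch in seq:
        if ch == "(":
            x = w_open
        elif ch == ")":
            x = w_close
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
        h = w_h * h + x + b
        states.append(h)
    return states


def is_balanced(seq, **weights):
    """Accept iff the state never goes negative and ends at exactly zero."""
    states = linear_rnn_states(seq, **weights)
    never_negative = all(h >= 0 for h in states)
    ends_at_zero = (not states) or states[-1] == 0
    return never_negative and ends_at_zero


if __name__ == "__main__":
    print(linear_rnn_states("(()"))           # hidden state tracks bracket depth
    print(is_balanced("(())"))                # balanced string
    print(is_balanced("())("))                # depth dips below zero
    # A recurrent weight slightly below 1 makes the state decay, so the
    # count drifts on long sequences even though the string is balanced.
    print(is_balanced("(" * 50 + ")" * 50, w_h=0.9))
```

With the hand-set weights the hidden state equals the current bracket depth, so the check is exact for any sequence length. Moving `w_h` away from 1 illustrates how a small deviation in the weights accumulates into a counting error on longer inputs.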