We begin by reiterating that common neural network activation functions have simple Bayesian origins. In this spirit, we go on to show that Bayes's theorem also implies a simple recurrence relation; this leads to a Bayesian recurrent unit with a prescribed feedback formulation. We show that the introduction of a context indicator leads to a variable feedback similar to the forget mechanism in conventional recurrent units. A similar approach yields a probabilistic input gate. The Bayesian formulation leads naturally to the two-pass algorithm of the Kalman smoother or forward-backward algorithm, meaning that inference naturally depends upon future inputs as well as past ones. Experiments on speech recognition confirm that the resulting architecture can match the performance of a bidirectional recurrent network while using only as many parameters as a unidirectional one. Further, when configured explicitly bidirectionally, the architecture can exceed the performance of a conventional bidirectional recurrent network.
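To make the central idea concrete, the following is a minimal sketch of the kind of recurrence the abstract alludes to, not the paper's exact unit: a recursive Bayes update for a binary hypothesis in which the posterior at one time step serves as the prior at the next. Under the assumption of conditionally independent observations, the update is additive in log-odds space and the posterior is recovered through a sigmoid, which is where the link to standard activation functions and recurrent feedback arises. The log-likelihood ratios `llr_sequence` are assumed to be supplied externally (e.g. by a feed-forward layer); the forget/input gating and the forward-backward smoothing described in the abstract are not shown here.

```python
import numpy as np

def logit(p):
    """Log-odds transform: maps a probability in (0, 1) to the real line."""
    return np.log(p) - np.log(1.0 - p)

def sigmoid(a):
    """Inverse of the log-odds transform."""
    return 1.0 / (1.0 + np.exp(-a))

def bayesian_recurrence(llr_sequence, prior=0.5):
    """Recursive Bayes update for a binary hypothesis.

    At each step the previous posterior acts as the prior, so with
    conditionally independent observations the update in log-odds space is
        logit(p_t) = logit(p_{t-1}) + llr_t,
    where llr_t is the log-likelihood ratio of the observation at time t.
    The sigmoid then recovers the posterior probability, giving a recurrent
    unit whose feedback is prescribed by Bayes's theorem.
    """
    h = prior
    posteriors = []
    for llr in llr_sequence:
        h = sigmoid(logit(h) + llr)  # sigmoid activation with log-odds feedback
        posteriors.append(h)
    return np.array(posteriors)

# Example: three observations, each mildly favouring the positive class;
# the posterior accumulates evidence monotonically.
print(bayesian_recurrence([0.8, 0.8, 0.8]))
```

In this purely forward form the posterior depends only on past and present inputs; the smoothing (forward-backward) variant mentioned in the abstract would add a second, backward pass so that each posterior is also conditioned on future observations.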