After a neural sequence model encounters an unexpected token, can its behavior be predicted? We show that RNN and transformer language models exhibit structured, consistent generalization in out-of-distribution contexts. We begin by introducing two idealized models of generalization in next-word prediction: a local context model in which generalization is consistent with the last word observed, and a global context model in which generalization is consistent with the global structure of the input. In experiments in English, Finnish, Mandarin, and random regular languages, we demonstrate that neural language models interpolate between these two forms of generalization: their predictions are well-approximated by a log-linear combination of local and global predictive distributions. We then show that, in some languages, noise mediates the two forms of generalization: noise applied to input tokens encourages global generalization, while noise in history representations encourages local generalization. Finally, we offer a preliminary theoretical explanation of these results by proving that the observed interpolation behavior is expected in log-linear models with a particular feature correlation structure. These results help explain the effectiveness of two popular regularization schemes and show that aspects of sequence model generalization can be understood and controlled.
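As a rough illustration of the log-linear interpolation of local and global predictive distributions described above (a minimal sketch, not the paper's implementation; the function name, the smoothing constant, and the mixing weight lam are assumptions), one could combine the two distributions as follows:

```python
import numpy as np

def log_linear_mix(p_local: np.ndarray, p_global: np.ndarray, lam: float) -> np.ndarray:
    """Combine two next-word distributions log-linearly.

    p_local, p_global: probability vectors over the vocabulary.
    lam: interpolation weight in [0, 1]; lam=1 recovers the local
         distribution, lam=0 the global one.
    """
    log_mix = lam * np.log(p_local + 1e-12) + (1.0 - lam) * np.log(p_global + 1e-12)
    mix = np.exp(log_mix - log_mix.max())  # subtract max for numerical stability
    return mix / mix.sum()                 # renormalize over the vocabulary

# Toy example over a 4-word vocabulary.
p_local = np.array([0.7, 0.1, 0.1, 0.1])   # consistent with the last word observed
p_global = np.array([0.1, 0.6, 0.2, 0.1])  # consistent with the global structure of the input
print(log_linear_mix(p_local, p_global, lam=0.5))
```

Under this sketch, sweeping lam between 0 and 1 traces out the family of predictions that the paper argues neural language models interpolate within.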