Causal language modeling (LM) uses word history to predict the next word. BERT, on the other hand, makes use of bi-directional word information in a sentence to predict words at masked positions. While BERT is effective for sequence encoding, it is non-causal by nature and is not designed for sequence generation. In this paper, we propose a novel language model, SUffix REtrieval-Augmented LM (SUREALM), that simulates a bi-directional contextual effect in an autoregressive manner. SUREALM employs an embedding retriever to search a data store for training sentences that share similar word history during sequence generation. In particular, the suffix portions of the retrieved sentences mimic the "future" context. We evaluated our proposed model on the DSTC9 spoken dialogue corpus and showed promising word perplexity reduction on the validation and test sets compared to competitive baselines.
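To make the retrieval idea concrete, the sketch below illustrates suffix retrieval during generation under simplified assumptions: the `SuffixDataStore` class, the toy bag-of-words `embed` function, and the fixed `split_ratio` are all hypothetical stand-ins for the paper's embedding retriever, data store construction, and LM conditioning, not the actual SUREALM implementation.

```python
# Minimal sketch of suffix retrieval-augmented decoding (hypothetical names;
# the paper's actual retriever, data store format, and LM interface may differ).
import numpy as np

def embed(text, dim=64):
    # Toy stand-in for an embedding retriever: hash word unigrams
    # into a fixed-size, L2-normalized bag-of-words vector.
    v = np.zeros(dim)
    for w in text.split():
        v[hash(w) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

class SuffixDataStore:
    """Stores (prefix embedding, suffix) pairs built from training sentences."""
    def __init__(self, sentences, split_ratio=0.5):
        self.entries = []
        for s in sentences:
            words = s.split()
            k = max(1, int(len(words) * split_ratio))
            prefix, suffix = " ".join(words[:k]), " ".join(words[k:])
            self.entries.append((embed(prefix), suffix))

    def retrieve(self, history, top_k=2):
        # Rank stored suffixes by similarity (dot product of normalized
        # vectors) between the current word history and the stored prefixes.
        q = embed(history)
        scored = sorted(self.entries, key=lambda e: -float(q @ e[0]))
        return [suffix for _, suffix in scored[:top_k]]

# During generation, the retrieved suffixes mimic the "future" context that a
# bi-directional model would see; a real LM would condition on them when
# predicting the next token.
store = SuffixDataStore([
    "i would like to book a table for two tonight",
    "can you recommend a hotel near the city center",
])
history = "i would like to book"
print(store.retrieve(history, top_k=1))  # -> ['a table for two tonight']
```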