Retrieval-based language models (R-LMs) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton (a retrieval automaton), which approximates the datastore search, based on (1) saving pointers between consecutive datastore entries, and (2) clustering entries into "states". This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The automaton is constructed without supervision, and a RetoMaton can be built from any text collection: either the original training corpus or a corpus from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces the LM's perplexity by up to 1.85, or alternatively saves up to 83% of the nearest-neighbor searches of $k$NN-LM (Khandelwal et al., 2020) without hurting perplexity. Our code and trained models are available at https://github.com/neulab/retomaton.
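To make the pointer-following idea from (1) concrete, below is a minimal sketch in Python. All names here (`Datastore`, `knn_search`, `retrieve`) are hypothetical and chosen for illustration; the sketch omits the clustering of entries into states from (2) and is a simplification under those assumptions, not the paper's implementation. The key intuition it captures: each datastore entry records its corpus successor, so when the previously retrieved entries correctly predicted the token that was actually generated, the next candidates can be obtained by following pointers instead of running a fresh nearest-neighbor search.

```python
import numpy as np

class Datastore:
    """Hypothetical flat datastore: one entry per corpus position."""

    def __init__(self, keys, next_tokens):
        self.keys = keys                # (N, d) context vectors
        self.next_tokens = next_tokens  # (N,) token id that followed each context
        # Pointer (1): entry i points to entry i + 1, its corpus successor.
        self.pointers = np.arange(1, len(keys) + 1) % len(keys)

    def knn_search(self, query, k):
        # Brute-force nearest-neighbor search, O(N) here;
        # a real system would use an approximate index (e.g., FAISS).
        dists = np.linalg.norm(self.keys - query, axis=1)
        return np.argsort(dists)[:k]

def retrieve(ds, query, prev_entries, generated_token, k):
    """Return candidate datastore entries for the current step.

    If any entry retrieved at the previous step predicted the token that
    was actually generated, follow its pointer (no search this step);
    otherwise fall back to a full nearest-neighbor search.
    """
    followed = [ds.pointers[e] for e in prev_entries
                if ds.next_tokens[e] == generated_token]
    if followed:
        return np.array(followed)       # search skipped: pointers suffice
    return ds.knn_search(query, k)      # fall back to kNN search
```

In this simplified view, the fraction of steps on which `followed` is non-empty is exactly the fraction of nearest-neighbor searches saved; the full method additionally clusters entries into automaton states so that pointers can be shared across similar contexts.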