Natural language understanding tasks such as open-domain question answering often require retrieving and assimilating factual information from multiple sources. We propose to address this problem by integrating a semi-parametric representation of a large text corpus into a Transformer model as a source of factual knowledge. Specifically, our method represents knowledge with `mention memory', a table of dense vector representations of every entity mention in a corpus. The proposed model, TOME, is a Transformer that accesses this information through internal memory layers in which each entity mention in the input passage attends to the mention memory. This approach enables synthesis of and reasoning over many disparate sources of information within a single Transformer model. In experiments using a memory of 150 million Wikipedia mentions, TOME achieves strong performance on several open-domain knowledge-intensive tasks, including the claim verification benchmarks HoVer and FEVER and several entity-based QA benchmarks. We also show that the model learns to attend to informative mentions without any direct supervision. Finally, we demonstrate that the model can generalize to new, unseen entities by updating the memory without retraining.
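To make the memory-layer mechanism concrete, the sketch below shows how each entity mention in an input passage might attend over a precomputed table of mention vectors. This is a minimal illustration, not the paper's implementation: all shapes, variable names (`memory_keys`, `memory_values`, `mention_queries`, `top_k`), and the use of exact top-k retrieval in place of the approximate nearest-neighbour search used at the 150-million-mention scale are assumptions made for clarity.

```python
import jax
import jax.numpy as jnp

# Hypothetical sizes for illustration only; the real mention memory holds
# roughly 150 million entries, far too large for exact search.
memory_size, key_dim, value_dim = 10_000, 128, 512
num_mentions = 4  # entity mentions detected in the input passage

rng = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(rng, 3)

# Precomputed mention memory: one dense key/value vector per corpus mention.
memory_keys = jax.random.normal(k1, (memory_size, key_dim))
memory_values = jax.random.normal(k2, (memory_size, value_dim))

# Query vectors produced from the entity-mention spans of the input passage.
mention_queries = jax.random.normal(k3, (num_mentions, key_dim))


def mention_memory_attention(queries, mem_keys, mem_values, top_k=32):
    """Each input mention attends over its top-k retrieved memory entries.

    Exact top-k is used here for simplicity; a production system would use
    approximate nearest-neighbour retrieval over the memory keys.
    """
    scores = queries @ mem_keys.T                        # [num_mentions, memory_size]
    top_scores, top_idx = jax.lax.top_k(scores, top_k)   # candidate mentions per query
    weights = jax.nn.softmax(top_scores, axis=-1)        # attention over candidates
    retrieved = mem_values[top_idx]                      # [num_mentions, top_k, value_dim]
    return jnp.einsum('mk,mkd->md', weights, retrieved)  # pooled knowledge per mention


pooled = mention_memory_attention(mention_queries, memory_keys, memory_values)
print(pooled.shape)  # (4, 512): one retrieved-knowledge vector per input mention
```

In the full model, the pooled vectors returned by such a layer would be projected back into the Transformer's hidden dimension and added to the corresponding mention representations, letting later layers reason jointly over the passage and the retrieved corpus knowledge.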