Recent advances in large-scale pre-training such as GPT-3 allow seemingly high-quality text to be generated from a given prompt. However, such generation systems often suffer from problems of hallucinated facts, and are not inherently designed to incorporate useful external information. Grounded generation models appear to offer remedies, but their training typically relies on rarely available parallel data in which corresponding documents are provided for context. We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal. The model learns to retrieve the documents with the highest utility in generation and attentively combines them in the output. We demonstrate that by taking advantage of external references, our approach can produce more informative and interesting text in both prose and dialogue generation.
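To make the joint-training objective concrete, the sketch below shows one common way a retriever can be trained end-to-end on the language model signal: the generation loss is marginalized over the retrieved documents, so gradients flow to the retriever's document scores and reward documents with high utility in generation. This is a minimal illustrative sketch, not the paper's actual implementation; the `retriever` and `generator` interfaces (including the `log_likelihood` helper) are assumptions.

```python
import torch
import torch.nn.functional as F

def joint_loss(retriever, generator, prompt, target, docs):
    """Marginalize the LM loss over k retrieved documents so the
    retriever is trained end-to-end on the language-model signal."""
    scores = retriever(prompt, docs)              # (k,) relevance scores
    log_p_doc = F.log_softmax(scores, dim=-1)     # log retrieval distribution

    # Log-likelihood of the target when generation is grounded
    # on each retrieved document (hypothetical helper method).
    log_p_gen = torch.stack([
        generator.log_likelihood(prompt, doc, target)  # scalar per doc
        for doc in docs
    ])                                            # (k,)

    # Mixture likelihood: log p(y|x) = logsumexp_z [log p(z|x) + log p(y|x,z)].
    # The gradient through log_p_doc is what teaches the retriever to
    # prefer the documents with the highest utility in generation.
    return -torch.logsumexp(log_p_doc + log_p_gen, dim=-1)
```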