Open-Domain Question Answering (ODQA) requires models to answer factoid questions without any given context. The common approach to this task is to train models on a large-scale annotated dataset to retrieve related documents and generate answers based on those documents. In this paper, we show that the ODQA architecture can be dramatically simplified by treating Large Language Models (LLMs) as a knowledge corpus, and we propose a Self-Prompting framework that enables LLMs to perform ODQA without training data or an external knowledge corpus. Concretely, we first prompt LLMs step by step to generate multiple pseudo QA pairs, each with a background passage and a one-sentence explanation, and then leverage the generated QA pairs for in-context learning. Experimental results show that our method surpasses previous state-of-the-art methods by an average of +8.8 EM on three widely used ODQA datasets, and even achieves performance comparable to several retrieval-augmented fine-tuned models.
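To make the two-stage pipeline concrete, the following minimal Python sketch illustrates it: a generation stage that prompts an LLM step by step for a passage, a QA pair, and a one-sentence explanation, and an inference stage that uses the generated pairs as in-context demonstrations. The `llm` callable, the prompt templates, the helper names (`build_pseudo_qa`, `answer_with_self_prompting`), and the random demonstration selection are illustrative assumptions, not the paper's exact templates or selection strategy.

```python
import random
from typing import Callable, Dict, List

# `llm` stands in for any text-completion interface (prompt -> continuation).
LLM = Callable[[str], str]

def build_pseudo_qa(llm: LLM, topic: str) -> Dict[str, str]:
    """Generation stage: prompt the LLM step by step to produce one
    pseudo QA pair with a background passage and an explanation."""
    passage = llm(f"Write a short factual passage about {topic}:")
    answer = llm(f"Passage: {passage}\nName one entity mentioned in the passage:")
    question = llm(
        f"Passage: {passage}\nWrite a factoid question whose answer is \"{answer}\":"
    )
    explanation = llm(
        f"Passage: {passage}\nQuestion: {question}\nAnswer: {answer}\n"
        "Explain the answer in one sentence:"
    )
    return {"question": question, "answer": answer,
            "passage": passage, "explanation": explanation}

def answer_with_self_prompting(
    llm: LLM, pool: List[Dict[str, str]], test_question: str, k: int = 4
) -> str:
    """Inference stage: answer a test question with k generated QA pairs
    as in-context demonstrations (random selection used here for brevity)."""
    demos = random.sample(pool, k)
    prompt = ""
    for d in demos:
        prompt += (f"Question: {d['question']}\n"
                   f"Evidence: {d['explanation']}\n"
                   f"Answer: {d['answer']}\n\n")
    prompt += f"Question: {test_question}\nAnswer:"
    return llm(prompt).strip()
```

In this sketch, the pool of pseudo QA pairs would be built once by calling `build_pseudo_qa` over a set of LLM-generated topics, then reused across all test questions, so no annotated data or external retriever enters the pipeline.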