Large language models (LLMs), such as GPT-3 and ChatGPT, have demonstrated remarkable results on various natural language processing (NLP) tasks with in-context learning, which performs inference conditioned on a few demonstration examples. Despite these successes, no investigation has assessed the ability of LLMs to perform document information extraction (DIE) via in-context learning. Applying LLMs to DIE poses two challenges: the modality gap and the task gap. To this end, we propose a simple but effective in-context learning framework, ICL-D3IE, which enables LLMs to perform DIE with diverse types of demonstration examples. Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations that benefit all test instances. We design relationship-describing demonstrations that enable LLMs to understand positional relationships among document segments. We introduce formatting demonstrations for easy answer extraction. Additionally, the framework improves these diverse demonstrations by updating them iteratively. Our experiments on three widely used benchmark datasets demonstrate that ICL-D3IE enables GPT-3/ChatGPT to achieve superior performance compared to previous pre-trained methods fine-tuned on full training data, in both the in-distribution (ID) and out-of-distribution (OOD) settings.
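As an illustration of the prompting idea, the sketch below assembles an in-context prompt from the three demonstration types described above (hard, relationship-describing, and formatting demonstrations) before appending the test input. All function and variable names, as well as the example demonstration strings, are hypothetical; the paper's actual demonstration selection and iterative updating procedures are more involved.

```python
def build_prompt(hard_demos, relationship_demos, formatting_demos, test_segment):
    """Concatenate diverse demonstration types with the test input.

    A minimal sketch of ICL-D3IE-style prompt assembly; the ordering and
    contents here are illustrative assumptions, not the paper's exact format.
    """
    parts = []
    parts.extend(hard_demos)          # difficult, distinct training segments
    parts.extend(relationship_demos)  # describe positional relationships
    parts.extend(formatting_demos)    # show the expected answer format
    parts.append(f"Input: {test_segment}\nAnswer:")
    return "\n\n".join(parts)

# Hypothetical usage with toy demonstrations:
prompt = build_prompt(
    hard_demos=["Input: Total: $12.00\nAnswer: Total -> HEADER"],
    relationship_demos=["The segment 'Total:' is to the left of '$12.00'."],
    formatting_demos=["Answer each segment as '<text> -> <label>'."],
    test_segment="Invoice No. 4711",
)
```

The resulting string would then be sent to GPT-3/ChatGPT, whose completion after the final `Answer:` is parsed according to the format shown in the formatting demonstrations.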