Open Information Extraction (OIE) is the task of unsupervised creation of structured information from text. OIE is often used as a starting point for a number of downstream tasks, including knowledge base construction, relation extraction, and question answering. While OIE methods are intended to be domain independent, they have been evaluated primarily on newspaper, encyclopedic, or general web text. In this article, we evaluate the performance of OIE on scientific texts originating from 10 different disciplines. To do so, we apply two state-of-the-art OIE systems using a crowd-sourced evaluation approach. We find that OIE systems perform significantly worse on scientific text than on encyclopedic text. We also provide an error analysis and suggest areas of work to reduce errors. Our corpus of sentences and judgments is made available.