Recent developments in pre-trained neural language modeling have led to leaps in accuracy on commonsense question-answering benchmarks. However, there is increasing concern that models overfit to specific tasks without learning to utilize external knowledge or perform general semantic reasoning. In contrast, zero-shot evaluations have shown promise as a more robust measure of a model's general reasoning abilities. In this paper, we propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks. Guided by a set of hypotheses, the framework studies how to transform various pre-existing knowledge resources into a form that is most effective for pre-training models. We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks. Extending prior work, we devise and compare four constrained distractor-sampling strategies. We provide empirical results across five commonsense question-answering tasks with data generated from five external knowledge resources. We show that, while an individual knowledge graph is better suited to specific tasks, a global knowledge graph brings consistent gains across different tasks. In addition, both preserving the structure of the task and generating fair and informative questions help language models learn more effectively.