Transformer-based End-to-End Question Generation

Question generation (QG) is a natural language generation task where a model is trained to ask questions corresponding to some input text. Most recent approaches frame QG as a sequence-to-sequence problem and rely on additional features and mechanisms to increase performance; however, these often increase model complexity, and can rely on auxiliary data unavailable in practical use. A single Transformer-based unidirectional language model leveraging transfer learning can be used to produce high quality questions while disposing of additional task-specific complexity. Our QG model, finetuned from GPT-2 Small, outperforms several paragraph-level QG baselines on the SQuAD dataset by 0.95 METEOR points. Human evaluators rated questions as easy to answer, relevant to their context paragraph, and corresponding well to natural human speech. Also introduced is a new set of baseline scores on the RACE dataset, which has not previously been used for QG tasks. Further experimentation with varying model capacities and datasets with non-identification type questions is recommended in order to further verify the robustness of pretrained Transformer-based LMs as question generators.
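To make the single-model framing concrete: a unidirectional language model can be finetuned for QG by presenting each paragraph and its questions as one left-to-right text sequence, then prompting with the paragraph alone at generation time. The sketch below shows only that data-formatting step. It is a minimal illustration under stated assumptions, not the authors' pipeline: the delimiter string `[SEP]` and the helper names `build_training_text` and `build_prompt` are hypothetical and do not come from the abstract.

```python
# Minimal sketch of formatting QG data for a left-to-right language model.
# ASSUMPTION: the delimiter and both helper names are hypothetical; the
# abstract does not say how contexts and questions are joined.

QUESTION_DELIMITER = "[SEP]"


def build_training_text(context, questions, delimiter=QUESTION_DELIMITER):
    """Join a paragraph and its questions into one training string."""
    parts = [context.strip()]
    parts.extend(q.strip() for q in questions)
    return f" {delimiter} ".join(parts)


def build_prompt(context, delimiter=QUESTION_DELIMITER):
    """Build the generation-time prompt: the paragraph followed by the delimiter."""
    return f"{context.strip()} {delimiter} "


if __name__ == "__main__":
    paragraph = "The Nile flows north through Egypt."
    questions = [
        "Which direction does the Nile flow?",
        "Which country does the Nile flow through?",
    ]
    print(build_training_text(paragraph, questions))
    print(repr(build_prompt(paragraph)))
```

Because the training string begins with exactly the text that `build_prompt` produces, the model sees the same prefix during finetuning and generation, so it can continue the prompt with a question without an auxiliary encoder or answer-position features.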

