Powerful generative models have led to recent progress in question generation (QG). However, it is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches. In this paper, we introduce QG-Bench, a multilingual and multidomain benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting. It includes general-purpose datasets such as SQuAD for English, datasets from ten domains and two styles, as well as datasets in eight different languages. Using QG-Bench as a reference, we perform an extensive analysis of the capabilities of language models for the task. First, we propose robust QG baselines based on fine-tuning generative language models. Then, we complement automatic evaluation based on standard metrics with an extensive manual evaluation, which in turn sheds light on the difficulty of evaluating QG models. Finally, we analyse both the domain adaptability of these models and the effectiveness of multilingual models in languages other than English. QG-Bench is released, together with the fine-tuned models presented in the paper, at https://github.com/asahi417/lm-question-generation; the models are also available through a demo at https://autoqg.net/.
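To make the conversion of question answering datasets into a QG setting concrete, the following is a minimal sketch of how a SQuAD-style QA record could be turned into an input/output pair for a generative QG model. The `<hl>` highlight tokens, the `generate question:` prefix, and the field names are illustrative assumptions, not necessarily the exact format used by QG-Bench.

```python
# Minimal sketch: convert a SQuAD-style QA record into a question-generation
# (QG) example. The "<hl>" marker and "generate question:" prefix are
# assumptions for illustration, not the definitive QG-Bench format.

HL = "<hl>"  # hypothetical token wrapping the answer span in the context


def qa_to_qg(example: dict) -> dict:
    """Turn {context, question, answers} into a (model input, target) pair."""
    context = example["context"]
    answer = example["answers"]["text"][0]
    start = example["answers"]["answer_start"][0]
    end = start + len(answer)

    # Highlight the answer span so the model knows which question to generate.
    highlighted = f"{context[:start]}{HL} {answer} {HL}{context[end:]}"
    return {
        "input_text": f"generate question: {highlighted}",
        "target_text": example["question"],
    }


if __name__ == "__main__":
    sample = {
        "context": "QG-Bench unifies question answering datasets for question generation.",
        "question": "What does QG-Bench unify?",
        "answers": {"text": ["question answering datasets"], "answer_start": [17]},
    }
    print(qa_to_qg(sample))
```

The resulting (input, target) pairs can then be used to fine-tune a generative language model with any standard sequence-to-sequence training setup.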