Quiz-Style Question Generation for News Stories

A large majority of American adults get at least some of their news from the Internet. Although many online news products aim to inform their users about the news, they lack scalable and reliable tools for measuring how well they achieve this goal, and therefore resort to noisy proxy metrics (e.g., click-through rates or reading time) to track their performance. As a first step towards measuring news informedness at scale, we study the problem of quiz-style multiple-choice question generation, which may be used to survey users about their knowledge of recent news. In particular, we formulate the problem as two sequence-to-sequence tasks: question-answer generation (QAG) and distractor (incorrect answer) generation (DG). We introduce NewsQuizQA, the first dataset intended for quiz-style question-answer generation, containing 20K human-written question-answer pairs from 5K news article summaries. Using this dataset, we propose a series of novel techniques for applying large pre-trained Transformer encoder-decoder models, namely PEGASUS and T5, to the tasks of question-answer generation and distractor generation. We show that our models outperform strong baselines according to both automated metrics and human raters. We provide a case study of running weekly quizzes with real-world users via the Google Surveys platform over the course of two months. Users generally found the automatically generated questions to be educational and enjoyable. Finally, to serve the research community, we are releasing the NewsQuizQA dataset.

