We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The interactions involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recently state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard are available at quac.ai.