In a conversational question answering scenario, a questioner seeks to extract information about a topic through a series of interdependent questions and answers. As the conversation progresses, they may switch to related topics, a phenomenon commonly observed in information-seeking search sessions. However, current datasets for conversational question answering are limiting in two ways: 1) they do not contain topic switches; and 2) they assume the reference text for the conversation is given, i.e., the setting is not open-domain. We introduce TopiOCQA (pronounced Tapioca), an open-domain conversational dataset with topic switches on Wikipedia. TopiOCQA contains 3,920 conversations with information-seeking questions and free-form answers. On average, a conversation in our dataset spans 13 question-answer turns and involves four topics (documents). TopiOCQA poses a challenging test bed for models, where efficient retrieval is required across multiple turns of the same conversation, in conjunction with constructing valid responses using the conversational history. We evaluate several baselines by combining state-of-the-art document retrieval methods with neural reader models. Our best model achieves an F1 of 55.8, falling short of human performance by 14.2 points, indicating the difficulty of our dataset. Our dataset and code are available at https://mcgill-nlp.github.io/topiocqa