What makes us curious? Analysis of a corpus of open-domain questions

Every day, people ask short questions through smart devices or online forums to seek answers to all kinds of queries. As the number of collected questions grows, it becomes difficult to answer each of them, which is one of the reasons behind the growing interest in automated question answering. Some questions are similar to existing ones that have already been answered, while others could be answered by an external knowledge source such as Wikipedia. An important question is what can be revealed by analysing a large set of questions. In 2017, the "We the Curious" (WTC) science centre in Bristol started a project to capture the curiosity of Bristolians: the project collected more than 10,000 questions on various topics. As no rules were given during collection, the questions are truly open-domain and range across a variety of topics. One important aim for the science centre was to understand what concerns its visitors had beyond science, particularly on societal and cultural issues. We addressed this question by developing an Artificial Intelligence tool that can perform several processing tasks: detecting equivalence between questions, detecting a question's topic and type, and answering the question. Because we focused on creating a "generalist" tool, we trained it with labelled data from different datasets. We called the resulting model QBERT. This paper describes the information we extracted from the automated analysis of the WTC corpus of open-domain questions.

