Commonsense knowledge plays an important role when we read. BERT's performance on the SQuAD dataset shows that its accuracy can exceed that of human users. However, this does not mean that computers surpass human beings in reading comprehension. CommonsenseQA is a large-scale dataset designed around commonsense knowledge, on which BERT achieves an accuracy of only 55.9%. This result shows that computers cannot apply commonsense knowledge to answer questions the way human beings do. The Comprehension Ability Test (CAT) divides reading comprehension ability into four levels, so that human-like comprehension can be pursued level by level. BERT performs well at level 1, which does not require commonsense knowledge. In this research, we propose a system that aims to allow computers to read articles and answer related questions using commonsense knowledge, like a human being, at CAT level 2. The system consists of three parts. First, we build a commonsense knowledge graph; then we automatically construct a commonsense-knowledge question dataset from it. Finally, BERT is combined with the commonsense knowledge to achieve reading comprehension ability at CAT level 2. Experiments show that the system can pass the CAT as long as the required commonsense knowledge is included in the knowledge base.