The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and the Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well in all cases while outperforming the previously reported state of the art by a margin. While the results suggest that BERT implicitly learns to establish complex relationships between entities, solving commonsense reasoning tasks may still require more than unsupervised models learned from huge text corpora.
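To make the idea of directly exploiting BERT's attentions concrete, the sketch below shows one way such attention-guided pronoun resolution could look in practice. It is a minimal illustration, assuming the HuggingFace transformers library and bert-base-uncased; the candidate scoring (mean attention mass from the pronoun position to each candidate's word pieces, averaged over layers and heads) is a simplified, hypothetical aggregation and not necessarily the paper's exact procedure.

```python
# Minimal sketch: use BERT attentions to resolve a Winograd-style pronoun.
# Assumes HuggingFace `transformers`; scoring is a simplified illustration.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "The trophy doesn't fit into the suitcase because it is too big."
pronoun = "it"
candidates = ["trophy", "suitcase"]

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Stack per-layer attentions into shape (layers, heads, seq_len, seq_len).
attn = torch.stack(outputs.attentions).squeeze(1)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
pronoun_idx = tokens.index(pronoun)

def candidate_score(word):
    # Mean attention from the pronoun position to every word piece of the
    # candidate, averaged over all layers and heads (hypothetical aggregation).
    pieces = tokenizer.tokenize(word)
    idxs = [i for i, t in enumerate(tokens) if t in pieces]
    return attn[:, :, pronoun_idx, idxs].mean().item()

scores = {c: candidate_score(c) for c in candidates}
print(scores)  # the candidate with the larger score is taken as the referent
```

Under this reading, the pronoun is resolved to whichever candidate receives the most attention from the pronoun's position, with no task-specific fine-tuning involved.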