BERT has been used to solve commonsense tasks such as CommonsenseQA. While prior research has found that BERT contains commonsense information to some extent, other work has shown that pre-trained models can rely on spurious associations (e.g., data bias) rather than key cues when solving sentiment classification and other problems. We quantitatively investigate the presence of structural commonsense cues in BERT when it solves commonsense tasks, and the importance of such cues to the model's predictions. Using two different measures, we find that BERT does use relevant knowledge to solve the task, and that the presence of commonsense knowledge is positively correlated with model accuracy.