Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT's attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Lastly, we propose an attention-based probing classifier and use it to further demonstrate that substantial syntactic information is captured in BERT's attention.
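As a concrete illustration of the kind of attention analysis described above (not the authors' released code), the sketch below pulls per-layer, per-head attention maps out of a pre-trained BERT and prints, for one arbitrarily chosen head, which token each position attends to most strongly. It assumes the Hugging Face `transformers` library; the layer/head indices are placeholders for illustration only.

```python
# Minimal sketch: inspect what a single BERT attention head attends to,
# e.g. to see how often it points at [SEP] or at a fixed positional offset.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "The keys to the cabinet are on the table ."
inputs = tokenizer(sentence, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one (batch, num_heads, seq_len, seq_len) tensor per layer.
layer, head = 7, 9  # arbitrary indices, purely for illustration
attn = outputs.attentions[layer][0, head]  # (seq_len, seq_len)

for i, token in enumerate(tokens):
    j = int(attn[i].argmax())
    print(f"{token:>10} -> {tokens[j]:<10} (weight {attn[i, j]:.2f})")
```

Looking at the argmax of each row is a crude but useful proxy for a head's behavior: heads that mostly point at [SEP], at the previous or next token, or at a syntactically related word show up immediately in this kind of printout.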
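The attention-based probing classifier mentioned at the end of the abstract can also be sketched in a few lines. The version below is a simplified, hedged reconstruction rather than the paper's exact model: for each word i it scores every candidate syntactic head word j using only the attention weights exchanged between i and j across all layers and heads, combined by a single learned linear layer; the module and variable names are illustrative.

```python
# Hedged sketch of an attention-based probing classifier for dependency heads.
# Input: the `outputs.attentions` tuple from the previous sketch.
import torch
import torch.nn as nn

class AttentionProbe(nn.Module):
    def __init__(self, num_layers: int = 12, num_heads: int = 12):
        super().__init__()
        # One learned weight per (layer, head, direction): attention i->j and j->i.
        self.scorer = nn.Linear(2 * num_layers * num_heads, 1)

    def forward(self, attentions):
        # attentions: tuple of (batch, heads, seq, seq) tensors, one per layer.
        stacked = torch.stack(attentions, dim=1)      # (batch, layers, heads, seq, seq)
        batch, layers, heads, seq, _ = stacked.shape
        fwd = stacked.permute(0, 3, 4, 1, 2)          # [b, i, j, l, h] = attention i -> j
        bwd = stacked.permute(0, 4, 3, 1, 2)          # [b, i, j, l, h] = attention j -> i
        feats = torch.cat([fwd, bwd], dim=-1)         # (batch, seq, seq, layers, 2*heads)
        feats = feats.reshape(batch, seq, seq, 2 * layers * heads)
        scores = self.scorer(feats).squeeze(-1)       # (batch, seq, seq)
        # Distribution over candidate head words j for each dependent i;
        # trained with cross-entropy against gold dependency heads.
        return scores.log_softmax(dim=-1)
```

Because the probe sees only attention weights, and no lexical or positional features beyond them, whatever dependency accuracy it reaches is evidence that the syntactic information lives in the attention maps themselves, which is the point the abstract's final sentence is making.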