We present an open-source tool for visualizing multi-head self-attention in Transformer-based language representation models. The tool extends earlier work by visualizing attention at three levels of granularity: the attention-head level, the model level, and the neuron level. We describe how each of these views can help to interpret the model, and we demonstrate the tool on the BERT model and the OpenAI GPT-2 model. We also present three use cases for analyzing GPT-2: detecting model bias, identifying recurring patterns, and linking neurons to model behavior.
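As background for the head-level view, the quantity such a tool renders for each attention head is the matrix softmax(QKᵀ/√d) over token pairs. The following is a minimal NumPy sketch of those per-head weight matrices; the function name and shapes are illustrative assumptions, not the tool's actual API:

```python
import numpy as np

def attention_weights(Q, K, num_heads):
    # Q, K: [seq_len, d_model]. Split into heads and compute
    # softmax(Q K^T / sqrt(d_head)) per head -- the token-to-token
    # weight matrices visualized at the attention-head level.
    seq_len, d_model = Q.shape
    d_head = d_model // num_heads
    Qh = Q.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    Kh = K.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)
    # Numerically stable softmax over the last axis.
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)  # [heads, seq, seq]

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 8))
K = rng.standard_normal((5, 8))
A = attention_weights(Q, K, num_heads=2)
print(A.shape)  # (2, 5, 5); each row is a distribution over tokens
```

In a real model these weights would come from a trained Transformer layer rather than random projections; the model-level view aggregates such matrices across all layers and heads, and the neuron-level view decomposes the scores into contributions from individual query/key elements.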