The attention mechanism has been widely used as a model component in deep neural networks and has become a critical building block in many state-of-the-art natural language models. Despite its great empirical success, the working mechanism of attention has not yet been investigated at sufficient theoretical depth. In this paper, we set up a simple text classification task and study the dynamics of training a simple attention-based classification model using gradient descent. In this setting, we show that, for each discriminative word that the model should attend to, a persisting identity relates its embedding to the inner product of its key and the query. This allows us to prove that, when the attention output is classified by a linear classifier, training must converge to attending to the discriminative words. Experiments are performed that validate our theoretical analysis and provide further insights.
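To make the setting concrete, the following is a minimal sketch (not the authors' released code) of the kind of model the abstract describes: each word has an embedding and a key, a single trainable query scores the keys, softmax attention pools the embeddings, and a linear classifier reads the pooled output. All class names, dimensions, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SimpleAttentionClassifier(nn.Module):
    """Hypothetical sketch of a single-query attention classifier."""

    def __init__(self, vocab_size: int, dim: int, num_classes: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)      # word embeddings e_w
        self.key = nn.Embedding(vocab_size, dim)        # per-word keys k_w
        self.query = nn.Parameter(torch.randn(dim))     # shared query q
        self.classifier = nn.Linear(dim, num_classes)   # linear classifier

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer word ids
        e = self.embed(tokens)                    # (batch, seq, dim)
        k = self.key(tokens)                      # (batch, seq, dim)
        scores = k @ self.query                   # <k_w, q> per word
        attn = torch.softmax(scores, dim=-1)      # attention weights
        ctx = (attn.unsqueeze(-1) * e).sum(dim=1)  # attention output
        return self.classifier(ctx)               # class logits
```

Under this reading, training such a model end to end with gradient descent on a cross-entropy loss is the dynamics the paper analyzes; the claimed convergence result concerns whether the softmax weights `attn` concentrate on the discriminative words.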