Attention is an increasingly popular mechanism used in a wide range of neural architectures. Because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on architectures designed to work with vector representations of textual data. We discuss the dimensions along which proposals differ, the possible uses of attention, and chart the major research activities and open challenges in the area.
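To make the mechanism concrete, the following is a minimal sketch of dot-product attention over vector representations, one common instance of the general model discussed in this article. The names `q`, `K`, and `V` (query, keys, values) and the scaling by the key dimension are illustrative conventions, not a definitive implementation.

```python
import numpy as np

def attention(q, K, V):
    """Return an attention-weighted combination of the value vectors in V."""
    scores = K @ q / np.sqrt(q.shape[0])     # compatibility of the query with each key
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ V                       # convex combination of the values

# Toy example: the query matches the first key most strongly,
# so the output leans toward the first value vector.
q = np.array([1.0, 0.0])
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0]])
out = attention(q, K, V)
```

Because the softmax weights sum to one, the output is always a convex combination of the value vectors; variants of attention differ mainly in how the compatibility scores are computed.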