This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending to a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing which specific parts of the sentence are encoded into the embedding. We evaluate our model on three different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods on all three tasks.
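The matrix embedding described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the weight names `W_s1`/`W_s2`, the shapes, and the Frobenius-norm form of the regularizer are assumptions about one plausible formulation, in which `r` attention rows each form a distribution over the `n` token states.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def structured_self_attention(H, W_s1, W_s2):
    """Hypothetical sketch: from token hidden states H (n x d),
    compute an attention matrix A (r x n) whose rows are
    distributions over tokens, and the 2-D sentence embedding
    M = A @ H (r x d)."""
    A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=-1)  # (r, n)
    M = A @ H                                          # (r, d)
    return A, M

def penalization(A):
    # Assumed regularizer: penalize overlap between attention rows
    # so different rows attend to different parts of the sentence.
    r = A.shape[0]
    return np.linalg.norm(A @ A.T - np.eye(r)) ** 2

# Toy example: n=5 tokens, hidden dim d=8, inner dim 4, r=3 rows.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
W_s1 = rng.normal(size=(4, 8))
W_s2 = rng.normal(size=(3, 4))
A, M = structured_self_attention(H, W_s1, W_s2)
assert M.shape == (3, 8)                       # matrix embedding, not a vector
assert np.allclose(A.sum(axis=1), 1.0)         # each row sums to 1 over tokens
```

Because each row of `A` is a distribution over tokens, plotting the rows as heatmaps gives the visualization mentioned above: which parts of the sentence each embedding row encodes.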