The Transformer model is widely used in natural language processing for sentence representation. However, previous Transformer-based models tend to focus on function words, which carry little meaning in most cases, and can only extract high-level semantic abstraction features. In this paper, we introduce two approaches to improve the performance of Transformers. First, we calculate the attention score by multiplying the part-of-speech weight vector with the correlation coefficient, which helps the model attend to words with more practical meaning. The weight vector is obtained from the input text sequence according to the importance of each part of speech. Second, we fuse the features of each layer to make the sentence representation more comprehensive and accurate. In experiments, we demonstrate the effectiveness of our model, Transformer-F, on three standard text classification datasets. Experimental results show that the proposed model significantly boosts text classification performance compared to the baseline model. Specifically, we obtain a 5.28% relative improvement over the vanilla Transformer on the simple tasks.
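As a rough illustration of the two ideas summarized above, the sketch below shows one possible way to rescale attention scores with part-of-speech importance weights and to fuse per-layer features. The tag set, the values in `POS_WEIGHTS`, and the helper names `pos_weighted_attention` and `fuse_layers` are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical importance weights per part-of-speech tag; the actual tag set
# and values used by Transformer-F are not specified in the abstract.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 1.0, "ADJ": 0.8, "ADV": 0.6, "FUNC": 0.2}

def pos_weighted_attention(q, k, v, pos_tags):
    """Scaled dot-product attention whose scores are rescaled by
    part-of-speech importance before the softmax.

    q, k, v:  (seq_len, d_model) tensors for one sentence
    pos_tags: list of seq_len tag strings, e.g. ["NOUN", "FUNC", ...]
    """
    d = q.size(-1)
    # Correlation coefficients between query and key positions.
    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (seq_len, seq_len)
    # Weight vector built from the importance of each token's part of speech.
    w = torch.tensor([POS_WEIGHTS.get(t, 0.2) for t in pos_tags])
    # Multiply the correlation coefficients by the PoS weights (one weight per
    # attended-to token), so content words contribute more to the scores.
    scores = scores * w.unsqueeze(0)
    attn = F.softmax(scores, dim=-1)
    return attn @ v

def fuse_layers(layer_outputs):
    """Average the representations produced by every encoder layer, so the
    sentence representation mixes low- and high-level features."""
    return torch.stack(layer_outputs, dim=0).mean(dim=0)
```

Averaging is just one simple fusion choice here; a learned weighted combination of layer outputs would fit the same interface.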