Text encoding is one of the most important steps in Natural Language Processing (NLP). It is handled well by the self-attention mechanism of the current state-of-the-art Transformer encoder, which has brought significant performance improvements to many NLP tasks. Although the Transformer encoder effectively captures general information in its resulting representations, the backbone information, i.e., the gist of the input text, is not given specific attention. In this paper, we propose explicit and implicit text compression approaches to enhance Transformer encoding, and we evaluate models built on these approaches on several typical downstream tasks that rely heavily on the encoding. The explicit text compression approaches use dedicated models to compress the text, while the implicit approach simply adds an additional module to the main model to handle compression. We further propose three integration strategies, namely backbone source-side fusion, target-side fusion, and both-side fusion, to incorporate the backbone information into Transformer-based models for various downstream tasks. Evaluation on benchmark datasets shows that both the explicit and implicit text compression approaches improve results over strong baselines. We therefore conclude that, compared with the baseline models, text compression helps the encoders learn better language representations.
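As an illustrative sketch of the backbone source-side fusion idea, the following snippet shows one plausible way to fuse a compressed (backbone) representation into the Transformer encoder output via cross-attention and a learned gate. The abstract does not specify the exact formulation, so the module, its name (`BackboneSourceFusion`), and all hyperparameters here are assumptions for illustration only, not the authors' implementation.

```python
# Illustrative sketch only: names and structure are assumptions,
# not the paper's actual fusion formulation.
import torch
import torch.nn as nn


class BackboneSourceFusion(nn.Module):
    """Hypothetical source-side fusion: the encoder output attends to the
    compressed (backbone) representation, and a learned gate mixes the two."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, enc_out: torch.Tensor, backbone: torch.Tensor) -> torch.Tensor:
        # enc_out:  (batch, src_len, d_model) -- standard Transformer encoder states
        # backbone: (batch, cmp_len, d_model) -- encoding of the compressed text
        attended, _ = self.cross_attn(enc_out, backbone, backbone)
        g = torch.sigmoid(self.gate(torch.cat([enc_out, attended], dim=-1)))
        return self.norm(g * attended + (1.0 - g) * enc_out)


# Usage: fuse backbone information into the encoder states before decoding.
fusion = BackboneSourceFusion(d_model=512)
enc_out = torch.randn(2, 30, 512)   # original-sequence encoder output
backbone = torch.randn(2, 10, 512)  # compressed-text (backbone) encoding
fused = fusion(enc_out, backbone)   # (2, 30, 512), passed on to the decoder
```

Target-side and both-side fusion would, under the same assumptions, apply an analogous gated cross-attention in the decoder or on both sides, respectively.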