Transformer is important for text modeling. However, it has difficulty handling long documents due to its quadratic complexity with respect to input text length. To address this problem, we propose a hierarchical interactive Transformer (Hi-Transformer) for efficient and effective long document modeling. Hi-Transformer models documents in a hierarchical way, i.e., it first learns sentence representations and then learns document representations. It can effectively reduce the complexity while capturing global document context in the modeling of each sentence. More specifically, we first use a sentence Transformer to learn the representation of each sentence. Then we use a document Transformer to model the global document context from these sentence representations. Next, we use another sentence Transformer to enhance sentence modeling using the global document context. Finally, we use a hierarchical pooling method to obtain the document embedding. Extensive experiments on three benchmark datasets validate the efficiency and effectiveness of Hi-Transformer in long document modeling.
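The pipeline above can be summarized in a minimal sketch, assuming PyTorch: a sentence Transformer encodes each sentence, a document Transformer contextualizes the pooled sentence embeddings, a second sentence Transformer re-reads each sentence with its document-aware embedding prepended as an extra token, and simple mean pooling stands in for the hierarchical pooling step. All module names, dimensions, the mean pooling, and the context-token injection are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class HiTransformerSketch(nn.Module):
    # Hypothetical sketch of the Hi-Transformer forward pass; sizes are
    # arbitrary defaults, not the paper's configuration.
    def __init__(self, vocab_size=30522, d_model=128, nhead=4, num_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = lambda: nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=4 * d_model, batch_first=True)
        # First sentence Transformer: context-free sentence modeling.
        self.sent_encoder = nn.TransformerEncoder(layer(), num_layers)
        # Document Transformer: models interactions among sentences.
        self.doc_encoder = nn.TransformerEncoder(layer(), num_layers)
        # Second sentence Transformer: re-reads each sentence together
        # with its document-aware embedding (an assumed injection scheme).
        self.sent_encoder2 = nn.TransformerEncoder(layer(), num_layers)

    def forward(self, token_ids):
        # token_ids: (num_sentences, sentence_len) for a single document.
        x = self.embed(token_ids)                       # (s, l, d)
        h = self.sent_encoder(x)                        # (s, l, d)
        sent_emb = h.mean(dim=1)                        # pool words -> (s, d)
        # Treat the document as one sequence of sentence embeddings.
        doc_ctx = self.doc_encoder(sent_emb.unsqueeze(0)).squeeze(0)  # (s, d)
        # Inject global document context by prepending it as a token.
        h2 = torch.cat([doc_ctx.unsqueeze(1), h], dim=1)  # (s, l+1, d)
        h2 = self.sent_encoder2(h2)
        # Hierarchical pooling: words -> sentence, sentences -> document.
        sent_final = h2.mean(dim=1)                     # (s, d)
        return sent_final.mean(dim=0)                   # (d,)


doc = torch.randint(0, 30522, (3, 16))   # 3 sentences, 16 tokens each
print(HiTransformerSketch()(doc).shape)  # torch.Size([128])
```

Prepending the document-aware embedding as an extra token lets self-attention in the second sentence Transformer propagate global context to every word; the paper may combine the context differently, and attention-based pooling could replace the mean pooling used here.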