VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction

With the rise of pre-trained transformers, remarkable progress has been made on textual pair modeling in support of related natural language applications. Two lines of approaches have been developed for text matching: interaction-based models, which perform full interactions over the textual pair, and representation-based models, which encode each text of the pair independently with siamese encoders. The former achieve compelling performance thanks to their deep interaction modeling ability, but at the cost of higher inference latency. The latter are efficient and widely adopted in practice, but suffer from severe performance degradation due to the lack of interactions. Although some prior works attempt to integrate interactive knowledge into representation-based models, they perform late interaction or knowledge transfer only at the top layers because of the computational cost. Interactive information in the lower layers is still missing, which limits the performance of representation-based solutions. To remedy this, we propose a novel Virtual InteRacTion mechanism, termed VIRT, that enables full and deep interaction modeling in representation-based models without actual inference computations. Concretely, VIRT asks representation-based encoders to conduct virtual interactions that mimic the behavior of interaction-based models. In addition, knowledge distilled from interaction-based encoders serves as the supervision signal to ensure the effectiveness of the virtual interactions. Since virtual interactions happen only at the training stage, VIRT does not increase the inference cost. Furthermore, we design a VIRT-adapted late interaction strategy to fully utilize the learned virtual interactive knowledge.
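
The idea of supervising virtual interactions with distilled knowledge can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the function names (`cross_attention`, `virtual_interaction_loss`), the use of scaled dot-product attention weights as the "interaction", the mean squared error as the distillation objective, and the toy random hidden states standing in for real encoder outputs. The sketch only shows the general shape of the idea: the two texts are encoded independently, a cross-text attention map is computed from those independent representations during training only, and that map is pulled toward the one produced by an interaction-based teacher.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys):
    """Scaled dot-product attention weights of `queries` over `keys`.

    queries: (len_q, dim) hidden states of one text
    keys:    (len_k, dim) hidden states of the other text
    returns: (len_q, len_k) matrix whose rows sum to 1
    """
    dim = queries.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(dim))


def virtual_interaction_loss(student_x, student_y, teacher_x_to_y, teacher_y_to_x):
    """Hypothetical distillation loss for one layer (an assumption, not the paper's formula).

    The student hidden states come from siamese encoders that never see
    the other text. The cross-text maps computed here exist only to be
    compared with the teacher's maps, so they add no inference cost.
    """
    virtual_x_to_y = cross_attention(student_x, student_y)
    virtual_y_to_x = cross_attention(student_y, student_x)
    return (np.mean((virtual_x_to_y - teacher_x_to_y) ** 2)
            + np.mean((virtual_y_to_x - teacher_y_to_x) ** 2))


# Toy data: random vectors stand in for real encoder hidden states.
rng = np.random.default_rng(0)
text_x = rng.normal(size=(4, 8))   # 4 tokens, hidden size 8
text_y = rng.normal(size=(6, 8))   # 6 tokens, hidden size 8

# Assumed teacher maps; in practice they would come from an interaction-based model.
teacher_x_to_y = cross_attention(text_x, text_y)
teacher_y_to_x = cross_attention(text_y, text_x)

# A student that already matches the teacher, and one that does not.
loss_matched = virtual_interaction_loss(text_x, text_y, teacher_x_to_y, teacher_y_to_x)
noisy_x = text_x + rng.normal(size=text_x.shape)
loss_noisy = virtual_interaction_loss(noisy_x, text_y, teacher_x_to_y, teacher_y_to_x)

print("loss when student matches teacher:", loss_matched)
print("loss when student deviates:", loss_noisy)
```

In a training loop, a term of this kind would presumably be added to the task loss for each layer being supervised, and dropped entirely at inference time, which is what keeps the deployed model as cheap as a plain siamese encoder.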
