Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results. Previous research focuses on two major types of dual encoders: the Siamese Dual Encoder (SDE), whose parameters are shared across the two encoders, and the Asymmetric Dual Encoder (ADE), which has two distinctly parameterized encoders. In this work, we explore different ways in which the dual encoder can be structured, and evaluate how these differences affect efficacy on QA retrieval tasks. By evaluating on MS MARCO, open-domain NQ, and the MultiReQA benchmarks, we show that SDE performs significantly better than ADE. We further propose three improved versions of ADEs that share or freeze parts of the architecture between the two encoder towers. We find that sharing parameters in the projection layer enables ADEs to perform competitively with or outperform SDEs. We further explore and explain why parameter sharing in the projection layer significantly improves the efficacy of dual encoders, by directly probing the embedding spaces of the two encoder towers with the t-SNE algorithm.
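To make the structural distinction concrete, the following is a minimal sketch in plain Python of how the two towers can share parameters. It is a toy illustration under stated assumptions, not the architecture evaluated in this work: each tower is reduced to one random "encoder" matrix followed by one projection matrix, and the names `Tower`, `build_dual_encoder`, and the `kind` strings are hypothetical. Sharing is modeled by both towers holding the same matrix object.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


class Tower:
    """Toy encoder tower: an encoder matrix followed by a projection matrix."""

    def __init__(self, encoder, projection):
        self.encoder = encoder
        self.projection = projection

    def embed(self, features):
        hidden = [math.tanh(h) for h in matvec(self.encoder, features)]
        return l2_normalize(matvec(self.projection, hidden))


def build_dual_encoder(kind, in_dim=8, hidden_dim=6, out_dim=4, seed=0):
    """Build (question_tower, answer_tower) for a hypothetical `kind`."""
    rng = random.Random(seed)
    q_enc = make_matrix(hidden_dim, in_dim, rng)
    q_proj = make_matrix(out_dim, hidden_dim, rng)
    if kind == "sde":
        # Siamese: both towers use the very same parameters.
        return Tower(q_enc, q_proj), Tower(q_enc, q_proj)
    a_enc = make_matrix(hidden_dim, in_dim, rng)
    if kind == "ade":
        # Asymmetric: every parameter is distinct between the towers.
        a_proj = make_matrix(out_dim, hidden_dim, rng)
        return Tower(q_enc, q_proj), Tower(a_enc, a_proj)
    if kind == "ade_shared_projection":
        # Asymmetric encoders, but one projection shared by both towers.
        return Tower(q_enc, q_proj), Tower(a_enc, q_proj)
    raise ValueError(f"unknown kind: {kind!r}")


def score(question_tower, answer_tower, question_features, answer_features):
    """Cosine similarity between the question and answer embeddings."""
    q = question_tower.embed(question_features)
    a = answer_tower.embed(answer_features)
    return sum(x * y for x, y in zip(q, a))
```

In this sketch, `build_dual_encoder("ade_shared_projection")` returns two towers with different encoders whose outputs pass through one common projection, so both sides are mapped into the embedding space by the same final transformation. In a trained model the shared matrix would receive gradient updates from both towers; training is omitted here.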