Sentence embedding methods have been applied successfully to many tasks. However, it is not well understood which properties of the resulting sentence embeddings are determined by the supervision signal used during training. In this paper, we focus on two types of sentence embedding methods with similar architectures and training tasks: one fine-tunes pre-trained language models on the natural language inference task, and the other fine-tunes pre-trained language models on the task of predicting a word from its definition sentence. We investigate the properties of these two methods. Specifically, we compare their performance on semantic textual similarity (STS) tasks, using STS datasets partitioned from two perspectives: 1) the source of the sentences and 2) the superficial similarity of the sentence pairs; we also compare their performance on downstream and probing tasks. Furthermore, we attempt to combine the two methods and demonstrate that the combination yields substantially better performance than either method alone on unsupervised STS tasks and downstream tasks.