Sentence embeddings encode sentences as fixed-length dense vectors and play an important role in many NLP tasks and systems. Methods for building sentence embeddings include unsupervised learning such as Quick-Thoughts and supervised learning such as InferSent. With the success of pretrained NLP models, recent research shows that fine-tuning pretrained BERT on SNLI and Multi-NLI data produces state-of-the-art sentence embeddings, outperforming previous sentence embedding methods on various evaluation benchmarks. In this paper, we propose a new method for building sentence embeddings via supervised contrastive learning. Specifically, our method fine-tunes pretrained BERT on SNLI data, combining a supervised cross-entropy loss with a supervised contrastive loss. Compared with a baseline that fine-tunes with the supervised cross-entropy loss alone, similar to the current state-of-the-art method SBERT, our supervised contrastive method improves performance by 2.8% on average on Semantic Textual Similarity (STS) benchmarks and by 1.05% on average on various sentence transfer tasks.
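For concreteness, a minimal sketch of one plausible form of the combined training objective is given below; it assumes the supervised contrastive term of Khosla et al. (2020), and the weighting coefficient λ, temperature τ, and the exact definition of the embedding z are assumptions not specified in this abstract:

\[
\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda \, \mathcal{L}_{\mathrm{SCL}},
\qquad
\mathcal{L}_{\mathrm{SCL}} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i \cdot z_a / \tau)},
\]

where, under these assumptions, z_i is the (normalized) sentence-pair representation of batch example i, P(i) is the set of other in-batch examples sharing its NLI label, and A(i) is the set of all in-batch examples other than i.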