The poor performance of the original BERT for sentence semantic similarity has been widely discussed in prior works. We find that the unsatisfactory performance is mainly due to biases in the static token embeddings and ineffective BERT layers, rather than the high cosine similarity of the sentence embeddings. To this end, we propose a prompt-based sentence embedding method that reduces token embedding biases and makes the original BERT layers more effective. By reformulating the sentence embedding task as a fill-in-the-blanks problem, our method significantly improves the performance of the original BERT. We discuss two prompt representation methods and three prompt search methods for prompt-based sentence embeddings. Moreover, we propose a novel unsupervised training objective based on template denoising, which substantially narrows the performance gap between the supervised and unsupervised settings. In experiments, we evaluate our method in both non fine-tuned and fine-tuned settings. Even the non fine-tuned method can outperform fine-tuned methods such as unsupervised ConSERT on STS tasks. Our fine-tuned method outperforms the state-of-the-art method SimCSE in both unsupervised and supervised settings. Compared to SimCSE, we achieve improvements of 2.29 and 2.58 points on BERT and RoBERTa respectively under the unsupervised setting.
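To make the fill-in-the-blanks reformulation concrete, the following is a minimal sketch (not the paper's exact implementation or template) of how a prompt-based sentence embedding could be obtained in the non fine-tuned setting: the sentence is placed into an illustrative template containing a [MASK] token, and the contextual hidden state at the [MASK] position is taken as the sentence representation.

```python
# Minimal sketch, assuming an illustrative template; the actual templates
# and representation choices are discussed in the paper body.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def prompt_sentence_embedding(sentence: str) -> torch.Tensor:
    # Hypothetical template for illustration only.
    text = f'This sentence : "{sentence}" means {tokenizer.mask_token} .'
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)
    # Use the contextual vector at the [MASK] position as the sentence embedding.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    return hidden[0, mask_pos].squeeze(0)

# Example: cosine similarity between two sentences without any fine-tuning.
a = prompt_sentence_embedding("A man is playing a guitar.")
b = prompt_sentence_embedding("Someone plays an instrument.")
print(torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```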