Incorporating contrastive learning objectives in sentence representation learning (SRL) has yielded significant improvements on many sentence-level NLP tasks. However, it is not well understood why contrastive learning works for learning sentence-level semantics. In this paper, we take a closer look at contrastive sentence representation learning through the lens of isotropy and learning dynamics, interpreting its success through the geometry of the representation shifts. We show that contrastive learning brings isotropy and, surprisingly, learns to converge tokens to similar positions in the semantic space when given the signal that they are in the same sentence. Moreover, what we formalize as "spurious contextualization" is mitigated for semantically meaningful tokens, while amplified for functional ones. The embedding space is pushed toward the origin during training, with more areas now better defined. We ablate these findings by observing the learning dynamics under different training temperatures, batch sizes, and pooling methods. With these findings, we aim to shed light on future designs of sentence representation learning methods.