Existing pre-trained models have achieved state-of-the-art performance on various text classification tasks and have proven useful for learning universal language representations. However, even advanced pre-trained models cannot effectively distinguish the semantic discrepancies between similar texts, which greatly degrades performance on hard-to-distinguish classes. To address this problem, we propose a novel method, Contrastive Learning with Label Distance (CLLD). Inspired by recent advances in contrastive learning, we design a classification method that uses label distance to learn contrastive classes. CLLD remains sensitive to the subtle differences that lead to different label assignments, while simultaneously generating distinct representations for classes that are similar to one another. Extensive experiments on public benchmarks and internal datasets demonstrate that our method improves the performance of pre-trained models on classification tasks. Importantly, our experiments suggest that the learned label distance relieves the adversarial relationship between similar classes.
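The abstract does not spell out the exact objective, so the following is only a minimal PyTorch sketch of one plausible instantiation: a supervised contrastive loss in which negative pairs are reweighted by a label-distance matrix, so that semantically close classes receive finer-grained repulsion. The names `clld_loss` and `label_dist` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def clld_loss(embeddings, labels, label_dist, temperature=0.1):
    """Hypothetical sketch of a contrastive loss weighted by label distance.

    embeddings: (N, d) batch of representations (e.g., [CLS] vectors).
    labels:     (N,) integer class ids.
    label_dist: (C, C) float matrix; label_dist[i, j] is an assumed distance
                between classes i and j (0 on the diagonal).
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature  # pairwise cosine similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)

    # Positives: same-label pairs, excluding each sample paired with itself.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye

    # Weight each negative pair by the distance between its two labels,
    # so hard-to-distinguish (close) classes are separated more carefully.
    dist = label_dist[labels.unsqueeze(1), labels.unsqueeze(0)]  # (N, N)
    weights = torch.where(pos_mask, torch.ones_like(dist), 1.0 + dist)

    exp_sim = (torch.exp(sim) * weights).masked_fill(eye, 0.0)
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)

    # Average the log-probability over each sample's positives (SupCon-style).
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_count
    return loss.mean()
```

Under this reading, setting `label_dist` to an all-ones off-diagonal matrix recovers a standard supervised contrastive loss, while larger entries for confusable class pairs increase the penalty on their similarity.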