This paper presents an analysis of the influence of Distance Metric Learning (DML) loss functions on the supervised fine-tuning of language models for classification tasks. We experimented with known datasets from the SentEval Transfer Tasks. Our experiments show that applying a DML loss function can improve the downstream classification performance of RoBERTa-large models in few-shot scenarios. Models fine-tuned with the SoftTriple loss outperform models trained with the standard categorical cross-entropy loss by about 2.89 percentage points on average, ranging from 0.04 to 13.48 percentage points depending on the training dataset. Additionally, we conducted a comprehensive analysis with explainability techniques to assess the models' reliability and explain their results.
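For readers unfamiliar with the SoftTriple objective compared above, the following is a minimal NumPy sketch of its standard formulation (soft assignment over K centers per class, then a cross-entropy over the relaxed class similarities). This is not the authors' implementation; the hyperparameter names `la` (lambda), `gamma`, and `delta` follow the original SoftTriple paper and the center shapes are illustrative assumptions:

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def softtriple_loss(x, centers, y, la=20.0, gamma=0.1, delta=0.01):
    """SoftTriple loss for a single L2-normalized embedding x.

    x       : (D,)      L2-normalized example embedding
    centers : (C, K, D) K L2-normalized centers per class
    y       : int       ground-truth class index
    """
    C, K, D = centers.shape
    sims = centers @ x                        # (C, K) cosine similarities
    weights = softmax(sims / gamma, axis=1)   # soft assignment over the K centers
    s = (weights * sims).sum(axis=1)          # (C,) relaxed per-class similarity
    # Margin delta is subtracted from the true class before the softmax.
    logits = la * (s - delta * (np.arange(C) == y))
    return -logits[y] + np.log(np.exp(logits).sum())

# Illustrative usage with random normalized embeddings and centers.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
x /= np.linalg.norm(x)
centers = rng.normal(size=(3, 2, 4))
centers /= np.linalg.norm(centers, axis=2, keepdims=True)
loss = softtriple_loss(x, centers, y=1)
```

In contrast to the categorical cross-entropy baseline, which operates on classifier logits, this objective pulls embeddings toward one of several learned centers per class, which is what makes it a DML loss.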