ContraSim -- A Similarity Measure Based on Contrastive Learning

Recent work has compared neural network representations via similarity-based analyses, shedding light on how different aspects (architecture, training data, etc.) affect models' internal representations. The quality of a similarity measure is typically evaluated by its success in assigning a high score to representations that are expected to be matched. However, existing similarity measures perform mediocrely on standard benchmarks. In this work, we develop a new similarity measure, dubbed ContraSim, based on contrastive learning. In contrast to common closed-form similarity measures, ContraSim learns a parameterized measure by using both similar and dissimilar examples. We perform an extensive experimental evaluation of our method, with both language and vision models, on the standard layer prediction benchmark and two new benchmarks that we introduce: the multilingual benchmark and the image-caption benchmark. In all cases, ContraSim achieves much higher accuracy than previous similarity measures, even when presented with challenging examples, and reveals new insights not captured by previous measures.
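To make the idea of learning from both similar and dissimilar examples concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss in NumPy. It is an illustration under stated assumptions, not necessarily the objective ContraSim uses: the linear encoder `W`, the temperature value, and the synthetic data are all hypothetical, and the code only evaluates the loss rather than training the encoder.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Generic InfoNCE-style loss: row i of z_a is the positive for row i
    of z_b; every other row of z_b serves as a negative."""
    logits = cosine_similarity_matrix(z_a, z_b) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)

# Synthetic "representations": x_b_matched is a slightly noisy copy of x_a,
# so row i of each array should be treated as a similar pair.
x_a = rng.normal(size=(8, 16))
x_b_matched = x_a + 0.01 * rng.normal(size=(8, 16))
# Reversing the rows breaks the pairing, so the "positives" are now mismatched.
x_b_shuffled = x_b_matched[::-1]

# Hypothetical linear encoder (assumption); a learned measure would train W
# to minimize the loss instead of keeping it fixed and random.
W = rng.normal(size=(16, 4))

loss_matched = info_nce_loss(x_a @ W, x_b_matched @ W)
loss_shuffled = info_nce_loss(x_a @ W, x_b_shuffled @ W)
print(loss_matched, loss_shuffled)
```

The loss is lower when the paired rows really are similar, which is the signal a learned measure would be trained on; a closed-form measure has no such trainable parameters.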


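Related content

Similarity measure

A similarity measure is a quantity that assesses how close two objects are to each other. The closer the two objects are, the larger their similarity measure; the farther apart they are, the smaller it is. There are many ways to define a similarity measure, and the choice generally depends on the problem at hand. Commonly used similarity measures include the correlation coefficient (which measures how close variables are) and the similarity coefficient (which measures how close samples are). When the samples consist of qualitative data, the closeness between samples can be measured with matching coefficients, degree of agreement, and similar quantities.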
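Two of the measures named above can be sketched in a few lines of Python. The function names are hypothetical, and "matching coefficient" is assumed here to mean the simple matching coefficient (the fraction of attributes on which two samples agree); other definitions exist.

```python
import numpy as np

def pearson_correlation(x, y):
    """Correlation coefficient between two variables (Pearson's r)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def simple_matching_coefficient(a, b):
    """Fraction of qualitative attributes on which two samples agree
    (assumed definition of the matching coefficient)."""
    a = np.asarray(a)
    b = np.asarray(b)
    return float(np.mean(a == b))

# Two variables related by an exact linear rule.
r = pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8])

# Two samples described by three qualitative attributes; they agree on two.
smc = simple_matching_coefficient(
    ["red", "round", "small"],
    ["red", "square", "small"],
)
print(r, smc)
```

Both functions return larger values for closer inputs, matching the convention described above.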
