Contrastive losses have long been a key ingredient of deep metric learning and are now becoming more popular due to the success of self-supervised learning. Recent research has shown the benefit of decomposing such losses into two sub-losses which act in a complementary way when learning the representation network: a positive term and an entropy term. Although the overall loss is thus defined as a combination of two terms, the balance between these two terms is often hidden behind implementation details, and in practice it is largely ignored and sub-optimal. In this work, we approach the balance of contrastive losses as a hyper-parameter optimization problem, and propose a coordinate descent-based search method that efficiently finds the hyper-parameters optimizing evaluation performance. In the process, we extend existing balance analyses to the contrastive margin loss, include batch size in the balance, and explain how to aggregate loss elements from the batch to maintain near-optimal performance over a larger range of batch sizes. Extensive experiments with benchmarks from deep metric learning and self-supervised learning show that optimal hyper-parameters are found faster with our method than with other common search methods.
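To make the search strategy concrete, below is a minimal sketch of a coordinate descent over the two balance hyper-parameters (the weights on the positive term and the entropy term). The function `evaluate` and the grid names are illustrative assumptions, not the paper's actual API: `evaluate(w_pos, w_ent)` is assumed to train the representation network with the weighted loss `w_pos * L_positive + w_ent * L_entropy` and return a validation metric (higher is better).

```python
# Hypothetical sketch of a coordinate descent hyper-parameter search:
# each round sweeps one balance weight while holding the other fixed.
from typing import Callable, Sequence, Tuple


def coordinate_descent_search(
    evaluate: Callable[[float, float], float],   # assumed: trains and returns a validation score
    pos_grid: Sequence[float],                   # candidate weights for the positive term
    ent_grid: Sequence[float],                   # candidate weights for the entropy term
    n_rounds: int = 3,
) -> Tuple[float, float, float]:
    # Start from the middle of each grid.
    w_pos = pos_grid[len(pos_grid) // 2]
    w_ent = ent_grid[len(ent_grid) // 2]
    best_score = evaluate(w_pos, w_ent)

    for _ in range(n_rounds):
        # Sweep the positive-term weight with the entropy weight fixed.
        for cand in pos_grid:
            score = evaluate(cand, w_ent)
            if score > best_score:
                best_score, w_pos = score, cand
        # Sweep the entropy-term weight with the positive weight fixed.
        for cand in ent_grid:
            score = evaluate(w_pos, cand)
            if score > best_score:
                best_score, w_ent = score, cand

    return w_pos, w_ent, best_score
```

Compared with a full grid search over both weights, this sketch evaluates only one axis at a time per round, which is where the efficiency gain of a coordinate-wise search comes from.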