Self-supervised monocular depth estimation (MDE) models universally suffer from the notorious edge-fattening issue. Triplet loss, a widespread metric-learning strategy, has largely succeeded in many computer vision applications. In this paper, we redesign the patch-based triplet loss in MDE to alleviate the ubiquitous edge-fattening issue. We identify two drawbacks of the raw triplet loss in MDE and demonstrate our problem-driven redesigns. First, we present a minimum-operator-based strategy applied to all negative samples, to prevent well-performing negatives from sheltering the errors of edge-fattening negatives. Second, we split the anchor-positive distance and anchor-negative distance from within the original triplet, which directly optimizes the positives without any mutual effect with the negatives. Extensive experiments show that the combination of these two small redesigns achieves unprecedented results: our powerful and versatile triplet loss not only makes our model outperform all previous SoTA methods by a large margin, but also provides substantial performance boosts to a large number of existing models, while introducing no extra inference computation at all.
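The two redesigns above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function name, margin parameters, and distance inputs are hypothetical, and real patch-based distances would be computed per pixel from depth/image patches.

```python
import numpy as np

def redesigned_triplet_loss(d_ap, d_an, margin_p=0.0, margin_n=1.0):
    """Sketch of the two redesigns (hypothetical names and margins).

    d_ap : (N,) anchor-positive distances, one per anchor.
    d_an : (N, K) anchor-negative distances, K negatives per anchor.
    """
    d_ap = np.asarray(d_ap, dtype=float)
    d_an = np.asarray(d_an, dtype=float)

    # Redesign 1: take the minimum over all K negatives, so a single
    # well-performing (distant) negative cannot shelter the error of a
    # failing (too-close) edge-fattening negative.
    d_an_min = d_an.min(axis=1)

    # Redesign 2: split the anchor-positive and anchor-negative terms,
    # so positives are optimized directly, with no mutual effect
    # between the two distances (unlike the raw d_ap - d_an + m form).
    loss_pos = np.maximum(d_ap - margin_p, 0.0)
    loss_neg = np.maximum(margin_n - d_an_min, 0.0)
    return float((loss_pos + loss_neg).mean())
```

With a raw triplet loss averaged over negatives, a large anchor-negative distance on one sample could cancel out a too-small distance on another; the min operator keeps each anchor's worst negative in view, and the split terms let the positive branch shrink to zero independently.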