Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years. However, their accuracy and completeness still leave room for improvement. In this paper, we propose a new method, inspired by contrastive learning and feature matching, to enhance the performance of existing networks. First, we propose a Contrast Matching Loss (CML), which treats the correct matching points along the depth dimension as positive samples and all other points as negative samples, and computes a contrastive loss based on feature similarity. We further propose a Weighted Focal Loss (WFL) for better classification capability, which uses the predicted confidence to down-weight the loss contribution of low-confidence pixels in unimportant areas. Extensive experiments on the DTU, Tanks and Temples, and BlendedMVS datasets show that our method achieves state-of-the-art performance and a significant improvement over the baseline network.
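The abstract does not give the exact form of either loss, so the following is only a minimal per-pixel sketch of how they might look. It assumes an InfoNCE-style formulation with cosine similarity for CML and a simple confidence threshold for WFL. The function names, `temperature`, `gamma`, `conf_thresh`, and `low_weight` are all hypothetical, and the notion of "unimportant areas" is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)


def contrast_matching_loss(ref_feat, src_feats, pos_idx, temperature=0.1):
    """InfoNCE-style contrastive loss for one pixel (assumed form).

    ref_feat:  reference-view feature, length C.
    src_feats: source-view features sampled at D depth hypotheses (D x C).
    pos_idx:   index of the hypothesis treated as the positive sample,
               e.g. the one closest to the ground-truth depth.
    """
    logits = [cosine(ref_feat, f) / temperature for f in src_feats]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of the positive among all hypotheses.
    return log_sum - logits[pos_idx]


def weighted_focal_loss(prob, gt_idx, gamma=2.0, conf_thresh=0.5, low_weight=0.25):
    """Focal loss for one pixel, down-weighted when confidence is low.

    prob:   probabilities over D depth hypotheses (sums to 1).
    gt_idx: index of the ground-truth hypothesis.
    The confidence is taken as max(prob); this weighting rule is an
    assumption, not the rule from the paper.
    """
    p_t = min(max(prob[gt_idx], 1e-8), 1.0)
    focal = -((1.0 - p_t) ** gamma) * math.log(p_t)
    weight = low_weight if max(prob) < conf_thresh else 1.0
    return weight * focal
```

In a real network both losses would be computed over all valid pixels in a batch with tensor operations; the scalar version here is meant only to make the positive/negative split along the depth dimension and the confidence-based weighting concrete.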