This paper presents Contrastive Reconstruction (ConRec), a self-supervised learning algorithm that obtains image representations by jointly optimizing a contrastive and a self-reconstruction loss. We demonstrate that state-of-the-art contrastive learning methods (e.g. SimCLR) fall short in capturing fine-grained visual features in their representations. ConRec extends the SimCLR framework by adding (1) a self-reconstruction task and (2) an attention mechanism within the contrastive learning task. This is accomplished by applying a simple encoder-decoder architecture with two heads. We show that both extensions contribute to improved vector representations of images with fine-grained visual features. Combining these concepts, ConRec outperforms SimCLR and SimCLR with Attention-Pooling on fine-grained classification datasets.
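
The joint objective can be illustrated with a minimal sketch: a shared encoder feeds both a contrastive (projection) head and a reconstruction (decoder) head, and the total loss sums a SimCLR-style NT-Xent term and a pixel-level reconstruction term. Note that this sketch only assumes the generic setup described above; the exact architecture, the attention mechanism, the reconstruction target, and the loss weighting (`alpha` below) used by ConRec are not specified here and are illustrative assumptions.

```python
# Minimal sketch of a joint contrastive + reconstruction objective.
# Assumes an NT-Xent contrastive loss and an MSE reconstruction loss;
# `ConRecModel` and `alpha` are hypothetical names, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConRecModel(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Shared encoder feeding two heads (toy encoder for illustration).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, dim))
        self.proj_head = nn.Linear(dim, dim)               # contrastive head
        self.decoder = nn.Linear(dim, 3 * 32 * 32)         # reconstruction head

    def forward(self, x):
        h = self.encoder(x)
        z = F.normalize(self.proj_head(h), dim=1)          # projection for NT-Xent
        x_rec = self.decoder(h).view(-1, 3, 32, 32)        # input reconstruction
        return z, x_rec

def nt_xent(z1, z2, tau=0.5):
    """SimCLR NT-Xent loss over two batches of normalized projections."""
    z = torch.cat([z1, z2], dim=0)                         # 2N x d
    sim = z @ z.t() / tau                                  # cosine similarities
    sim.fill_diagonal_(float('-inf'))                      # exclude self-pairs
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Joint loss on two augmented views x1, x2 of the same batch of images:
model = ConRecModel()
x1, x2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
z1, rec1 = model(x1)
z2, rec2 = model(x2)
alpha = 1.0  # relative weighting of the reconstruction term (assumed)
loss = nt_xent(z1, z2) + alpha * (F.mse_loss(rec1, x1) + F.mse_loss(rec2, x2))
```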