Recent literature has shown that features obtained from supervised training of CNNs may over-emphasize texture rather than encode high-level information. In self-supervised learning in particular, texture as a low-level cue may provide shortcuts that prevent the network from learning higher-level representations. To address these problems, we propose to use classic methods based on anisotropic diffusion to augment training with images whose texture has been suppressed. This simple method retains important edge information while suppressing texture. We empirically show that our method achieves state-of-the-art results on object detection and image classification across eight diverse datasets, in both supervised and self-supervised settings such as MoCoV2 and Jigsaw. Our method is particularly effective for transfer learning tasks, and we observe improved performance on five standard transfer learning datasets. The large improvements (up to 11.49\%) on the Sketch-ImageNet and DTD datasets, together with additional visual analyses using saliency maps, suggest that our approach helps in learning better representations that transfer better.
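For illustration, the following is a minimal sketch of the classic anisotropic diffusion (Perona-Malik) filter referenced above, which smooths fine texture while preserving strong edges. It is not the paper's exact augmentation pipeline; the function name, hyperparameters (n_iter, kappa, gamma), and the periodic boundary handling via np.roll are illustrative assumptions chosen for brevity.

\begin{verbatim}
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=30.0, gamma=0.1):
    """Perona-Malik anisotropic diffusion on a grayscale image.

    Gradients much smaller than kappa (texture) are diffused away,
    while gradients larger than kappa (edges) are preserved.
    Illustrative sketch; hyperparameters are not the paper's settings.
    """
    img = img.astype(np.float64)
    for _ in range(n_iter):
        # Finite differences to the four neighbours
        # (periodic boundaries via np.roll, for brevity).
        dN = np.roll(img, 1, axis=0) - img
        dS = np.roll(img, -1, axis=0) - img
        dE = np.roll(img, -1, axis=1) - img
        dW = np.roll(img, 1, axis=1) - img
        # Edge-stopping conduction: small across strong edges.
        cN = np.exp(-(dN / kappa) ** 2)
        cS = np.exp(-(dS / kappa) ** 2)
        cE = np.exp(-(dE / kappa) ** 2)
        cW = np.exp(-(dW / kappa) ** 2)
        # Explicit diffusion update.
        img = img + gamma * (cN * dN + cS * dS + cE * dE + cW * dW)
    return img
\end{verbatim}

In a training setup, such a filter could be applied to input images (per channel) to produce texture-suppressed views used alongside the original images as augmentations.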