Weakly Supervised Object Localization (WSOL) aims to localize objects using only image-level supervision. Existing works mainly rely on Class Activation Mapping (CAM) derived from a classification model. However, CAM-based methods usually focus on the most discriminative parts of an object (i.e., the incomplete localization problem). In this paper, we empirically show that this problem is associated with the mixing of activation values between less discriminative foreground regions and the background. To address it, we propose Class RE-Activation Mapping (CREAM), a novel clustering-based approach that boosts the activation values of the integral object regions. To this end, we introduce class-specific foreground and background context embeddings as cluster centroids. A CAM-guided momentum preservation strategy is developed to learn the context embeddings during training. At the inference stage, the re-activation mapping is formulated as a parameter estimation problem under a Gaussian Mixture Model, which can be solved by deriving an unsupervised Expectation-Maximization based soft-clustering algorithm. By simply integrating CREAM into various WSOL approaches, our method significantly improves their performance. CREAM achieves state-of-the-art performance on the CUB, ILSVRC and OpenImages benchmark datasets. Code will be available at https://github.com/Jazzcharles/CREAM.
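To illustrate the two mechanisms named in the abstract, the following is a minimal NumPy sketch and not the authors' implementation. It assumes per-pixel features flattened to shape (N, D), a two-component Gaussian Mixture Model with a shared isotropic variance, and squared Euclidean distance. The function names `momentum_update` and `em_reactivation` and the parameters `momentum`, `sigma`, and `n_iters` are hypothetical, chosen only for this example.

```python
import numpy as np


def momentum_update(embedding, features, mask, momentum=0.9):
    """Move a context embedding toward the mean of the masked features.

    embedding: (D,) current foreground or background embedding.
    features:  (N, D) per-pixel features of one image.
    mask:      (N,) boolean, e.g. a thresholded CAM or its complement.
    """
    if not mask.any():
        return embedding  # nothing selected, keep the old embedding
    return momentum * embedding + (1.0 - momentum) * features[mask].mean(axis=0)


def em_reactivation(features, fg_init, bg_init, n_iters=10, sigma=1.0):
    """Soft-cluster pixel features into foreground/background with EM.

    Assumption: two-component GMM with shared isotropic variance sigma**2,
    initialized from the foreground and background context embeddings.
    Returns the (N,) posterior probability of foreground for each pixel.
    """
    means = np.stack([fg_init, bg_init]).astype(float)  # (2, D)
    weights = np.array([0.5, 0.5])                       # mixing coefficients
    for _ in range(n_iters):
        # E-step: log(weight * Gaussian density), up to a shared constant.
        sq_dist = ((features[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_post = np.log(weights)[None, :] - sq_dist / (2.0 * sigma ** 2)
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)          # (N, 2) responsibilities
        # M-step: re-estimate means and mixing weights from responsibilities.
        totals = resp.sum(axis=0) + 1e-12
        means = (resp.T @ features) / totals[:, None]
        weights = totals / totals.sum()
    return resp[:, 0]
```

In this sketch, the returned foreground posterior can be reshaped to the feature map's height and width and used in place of the raw CAM when extracting a bounding box. The distance measure and variance treatment used in CREAM itself may differ.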