Weakly supervised segmentation methods using bounding box annotations focus on obtaining a pixel-level mask from each box containing an object. Existing methods typically depend on a class-agnostic mask generator, which operates on the low-level information intrinsic to an image. In this work, we utilize higher-level information from the behavior of a trained object detector, by seeking the smallest areas of the image from which the object detector produces almost the same result as it does from the whole image. These areas constitute a bounding-box attribution map (BBAM), which identifies the target object in its bounding box and thus serves as pseudo ground-truth for weakly supervised semantic and instance segmentation. Our approach significantly outperforms recent comparable techniques on the PASCAL VOC and MS COCO benchmarks for both tasks. In addition, we provide a detailed analysis of our method, offering deeper insight into the behavior of the BBAM.
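The core idea, finding the smallest image region from which a trained detector produces nearly the same output as from the full image, can be illustrated with a toy sketch. The snippet below is not the paper's method (which optimizes a continuous mask with gradients through a real detector); it is a hedged, greedy approximation using a hypothetical stand-in scoring function `toy_detector`, kept self-contained for illustration.

```python
import numpy as np

def toy_detector(image):
    # Hypothetical stand-in for a trained detector's output: a scalar
    # "confidence" driven mostly by the brightest region (the "object").
    return float(image.max() + 0.01 * image.mean())

def bbam_toy(image, detector, tol=0.05):
    """Greedy sketch of the BBAM idea: retain the smallest set of pixels
    from which the detector's output stays within `tol` (relative) of its
    output on the full image. Illustrative only; the paper instead
    optimizes a soft mask by backpropagating through the detector."""
    full = detector(image)
    mask = np.ones_like(image, dtype=bool)
    # Try discarding pixels from least to most intense.
    for idx in np.argsort(image, axis=None):
        coords = np.unravel_index(idx, image.shape)
        mask[coords] = False
        masked_out = detector(np.where(mask, image, 0.0))
        if abs(masked_out - full) > tol * abs(full):
            mask[coords] = True  # removal changed the output too much; keep pixel
    return mask

# Toy image: faint background noise plus a bright 2x2 "object".
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 0.1, size=(8, 8))
img[3:5, 3:5] = 1.0
attribution = bbam_toy(img, toy_detector)
```

Under these assumptions, the surviving `attribution` mask concentrates on the bright object region, mirroring how a BBAM highlights the pixels the detector actually relies on inside a bounding box.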