We propose point-based instance-level annotation, a new form of weak supervision for instance segmentation. It combines the standard bounding box annotation with labeled points that are uniformly sampled inside each bounding box. We show that existing instance segmentation models developed for full mask supervision, like Mask R-CNN, can be seamlessly trained with the point-based annotation without any major modifications. In our experiments, Mask R-CNN models trained on COCO, PASCAL VOC, Cityscapes, and LVIS with only 10 annotated points per object achieve 94%--98% of their fully-supervised performance. The new point-based annotation is approximately 5 times faster to collect than object masks, making high-quality instance segmentation more accessible for new data. Inspired by the new annotation form, we propose a modification to the PointRend instance segmentation module. For each object, the new architecture, called Implicit PointRend, generates parameters for a function that makes the final point-level mask prediction. Implicit PointRend is more straightforward and uses a single point-level mask loss. Our experiments show that the new module is more suitable for the proposed point-based supervision.
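To make the supervision signal concrete, below is a minimal PyTorch sketch (not the authors' code) of the idea described above: points are sampled uniformly inside an object's bounding box, each point receives a binary in/out label, and the mask head is trained with a point-level binary cross-entropy loss evaluated only at those points. The helper names `sample_box_points` and `point_mask_loss` are illustrative, not part of any released implementation.

```python
import torch
import torch.nn.functional as F


def sample_box_points(box, k=10, generator=None):
    """Uniformly sample k (x, y) points inside box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    xy = torch.rand(k, 2, generator=generator)
    xy[:, 0] = x0 + xy[:, 0] * (x1 - x0)
    xy[:, 1] = y0 + xy[:, 1] * (y1 - y0)
    return xy


def point_mask_loss(mask_logits, points, point_labels, image_size):
    """BCE between mask logits sampled at `points` and binary point labels.

    mask_logits:  (1, 1, H, W) predicted mask logits for one object.
    points:       (k, 2) absolute (x, y) coordinates in the image.
    point_labels: (k,) 1.0 if the point lies on the object, else 0.0.
    image_size:   (height, width) used to normalize coordinates.
    """
    h, w = image_size
    # grid_sample expects coordinates in [-1, 1], ordered (x, y).
    grid = points.clone()
    grid[:, 0] = grid[:, 0] / w * 2 - 1
    grid[:, 1] = grid[:, 1] / h * 2 - 1
    grid = grid.view(1, 1, -1, 2)                       # (1, 1, k, 2)
    pred = F.grid_sample(mask_logits, grid, align_corners=False)
    pred = pred.view(-1)                                # (k,)
    return F.binary_cross_entropy_with_logits(pred, point_labels)


# Toy usage: one object with 10 labeled points inside its box.
if __name__ == "__main__":
    torch.manual_seed(0)
    H, W = 64, 64
    box = (10.0, 12.0, 50.0, 48.0)
    points = sample_box_points(box, k=10)
    # Here the labels are read from a synthetic GT mask; with point-based
    # annotation an annotator would instead answer in/out per point.
    gt_mask = torch.zeros(H, W)
    gt_mask[20:40, 15:45] = 1.0
    labels = gt_mask[points[:, 1].long(), points[:, 0].long()]
    mask_logits = torch.randn(1, 1, H, W, requires_grad=True)
    loss = point_mask_loss(mask_logits, points, labels, (H, W))
    loss.backward()
    print(f"point-level mask loss: {loss.item():.4f}")
```

In a full pipeline this loss would simply replace the dense per-pixel mask loss of a Mask R-CNN-style head; everything else in the model stays unchanged.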