Manually annotating object segmentation masks is very time-consuming. Interactive segmentation methods are more efficient, but they remain unaffordable at a large scale because their cost still grows linearly with the number of annotated masks. In this paper, we propose a highly efficient annotation scheme for building large datasets with object segmentation masks. At a large scale, images contain many object instances with similar appearance. We exploit these similarities by applying hierarchical clustering to mask predictions made by a segmentation model. Our scheme efficiently searches through the hierarchy of clusters and selects which clusters to annotate. Humans manually verify only a few masks per cluster, and the labels are propagated to the whole cluster. Through a large-scale experiment that populates 1M unlabeled images with object segmentation masks for 80 object classes, we show that (1) we obtain 1M object segmentation masks with a total annotation time of only 290 hours; (2) we reduce annotation time by 76x compared to manual annotation; (3) the segmentation quality of our masks is on par with that of masks from manually annotated datasets. Code, data, and models are available online.
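The verify-and-propagate idea can be illustrated with a small sketch. The abstract does not describe the actual search procedure, so the code below assumes a simple top-down rule: sample a few masks from a cluster, accept the whole cluster if every sampled mask passes verification, and otherwise descend into its children. The names `Cluster`, `annotate`, `verify`, and `samples_per_cluster` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Cluster:
    """Node in a hypothetical cluster hierarchy over predicted masks."""
    mask_ids: List[int]
    children: List["Cluster"] = field(default_factory=list)


def annotate(
    root: Cluster,
    verify: Callable[[int], bool],
    samples_per_cluster: int = 3,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], int]:
    """Return (accepted mask ids, number of human verifications).

    Assumed rule: a cluster is accepted only if all sampled masks pass;
    a rejected cluster is replaced by its children, and a rejected leaf
    is dropped.
    """
    rng = rng or random.Random(0)
    accepted: List[int] = []
    n_checks = 0
    stack = [root]
    while stack:
        node = stack.pop()
        k = min(samples_per_cluster, len(node.mask_ids))
        sample = rng.sample(node.mask_ids, k)
        results = [verify(m) for m in sample]  # stands in for human checks
        n_checks += k
        if all(results):
            accepted.extend(node.mask_ids)  # propagate label to the cluster
        else:
            stack.extend(node.children)     # refine: look at sub-clusters
    return accepted, n_checks


# Toy hierarchy: masks 0-3 are good, masks 4-9 are bad.
good = Cluster(mask_ids=[0, 1, 2, 3])
bad = Cluster(mask_ids=[4, 5, 6, 7, 8, 9])
root = Cluster(mask_ids=list(range(10)), children=[good, bad])

accepted, n_checks = annotate(root, verify=lambda m: m < 4, samples_per_cluster=5)
print(sorted(accepted), n_checks)
```

In this toy case the root is rejected because any five sampled masks include at least one bad one, so the search descends and accepts only the clean sub-cluster. With fewer samples per cluster, a mixed cluster can pass by chance; that is the trade-off between verification cost and label noise that any such scheme has to manage.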