We introduce a new image segmentation task, termed Entity Segmentation (ES), which aims to segment all visual entities in an image without considering semantic category labels. It has many practical applications in image manipulation and editing, where segmentation mask quality is typically crucial but category labels are less important. In this setting, all semantically meaningful segments are treated equally as categoryless entities, and there is no thing-stuff distinction. Based on our unified entity representation, we propose a center-based entity segmentation framework with two novel modules to improve mask quality. Experimentally, both the new task and the framework show advantages over existing work. In particular, ES enables the following: (1) multiple datasets can be merged into one large training set without the need to resolve label conflicts; (2) a model trained on one dataset can generalize exceptionally well to other datasets with unseen domains. Our code is publicly available at https://github.com/dvlab-research/Entity.
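To illustrate point (1), the following is a minimal sketch of how dropping category labels removes label conflicts when datasets are merged. It is not taken from the released code; the annotation layout, the function names `to_entities` and `merge_datasets`, and the toy datasets are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy annotation format: one dict per segment, with an image id,
# a binary mask, and a dataset-specific category id.
Annotation = Dict[str, object]

ENTITY_ID = 0  # single class-agnostic label shared by every segment (assumption)


def to_entities(annotations: List[Annotation], source: str) -> List[Annotation]:
    """Replace dataset-specific category ids with one categoryless entity id."""
    return [
        {
            # Namespace image ids so images from different datasets cannot collide.
            "image_id": f"{source}/{ann['image_id']}",
            "mask": ann["mask"],
            "category_id": ENTITY_ID,
        }
        for ann in annotations
    ]


def merge_datasets(datasets: Dict[str, List[Annotation]]) -> List[Annotation]:
    """Concatenate several datasets after converting each to entities."""
    merged: List[Annotation] = []
    for name, annotations in datasets.items():
        merged.extend(to_entities(annotations, name))
    return merged


# Two toy datasets whose label spaces conflict: id 1 means "person" in the
# first and "sky" in the second (made-up values).
dataset_a = [
    {"image_id": 1, "mask": [[0, 1], [1, 1]], "category_id": 1},
    {"image_id": 1, "mask": [[1, 0], [0, 0]], "category_id": 7},
]
dataset_b = [
    {"image_id": 1, "mask": [[1, 1], [0, 0]], "category_id": 1},
]

merged = merge_datasets({"a": dataset_a, "b": dataset_b})
```

Because every segment carries the same label after conversion, no mapping between the two label spaces is needed, and thing and stuff segments are handled identically.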