Panoptic segmentation (PS) is a complex scene understanding task that requires high-quality segmentation of both thing objects and stuff regions. Previous methods handle these two classes with separate semantic and instance segmentation modules, followed by heuristic fusion or additional modules to resolve conflicts between the two outputs. This work simplifies the PS pipeline by modeling the two classes consistently in a novel PS framework, which extends a detection model with an extra module that predicts a category- and instance-aware pixel embedding (CIAE). CIAE is a novel pixel-wise embedding feature that encodes both semantic-classification and instance-distinction information. At inference, PS results are derived simply by assigning each pixel to a detected instance or a stuff class according to the learned embedding. Our method not only demonstrates fast inference speed but is also the first one-stage method to achieve performance comparable to two-stage methods on the challenging COCO benchmark.
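The inference step described above, assigning each pixel to a detected instance or a stuff class according to its embedding, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how a pixel embedding is compared with an instance or a stuff class, so the dot-product similarity used here is an assumption, and the names `assign_pixels`, `instance_embeddings`, and `stuff_embeddings` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def assign_pixels(
    pixel_embeddings: List[List[Vector]],
    instance_embeddings: List[Vector],
    stuff_embeddings: List[Vector],
) -> List[List[Tuple[str, int]]]:
    """Label each pixel with its best-matching instance or stuff class.

    Returns an H x W grid of ("thing", instance_index) or
    ("stuff", class_index) tuples. Dot-product similarity is an
    assumption made for this sketch, not a detail from the paper.
    """
    # One shared candidate list, so detected instances and stuff classes
    # compete for every pixel and no separate fusion step is needed.
    candidates = [("thing", i, e) for i, e in enumerate(instance_embeddings)]
    candidates += [("stuff", c, e) for c, e in enumerate(stuff_embeddings)]
    if not candidates:
        raise ValueError("need at least one instance or stuff embedding")

    result = []
    for row in pixel_embeddings:
        out_row = []
        for pixel in row:
            # Pick the candidate whose embedding is most similar to the pixel's.
            kind, index, _ = max(candidates, key=lambda c: dot(pixel, c[2]))
            out_row.append((kind, index))
        result.append(out_row)
    return result


# Toy 2 x 2 image with 3-dimensional embeddings (hypothetical values).
pixels = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
instances = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two detected instances
stuff = [[0.0, 0.0, 1.0]]                       # one stuff class
panoptic = assign_pixels(pixels, instances, stuff)
```

Because every pixel receives exactly one label from a single ranking, the output is a valid panoptic map by construction. This is one way to read the claim that no conflict resolution between semantic and instance outputs is required.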