We present a new method for efficient high-quality image segmentation of objects and scenes. By analogizing classical computer graphics methods for efficient rendering with over- and undersampling challenges faced in pixel labeling tasks, we develop a unique perspective of image segmentation as a rendering problem. From this vantage, we present the PointRend (Point-based Rendering) neural network module: a module that performs point-based segmentation predictions at adaptively selected locations based on an iterative subdivision algorithm. PointRend can be flexibly applied to both instance and semantic segmentation tasks by building on top of existing state-of-the-art models. While many concrete implementations of the general idea are possible, we show that a simple design already achieves excellent results. Qualitatively, PointRend outputs crisp object boundaries in regions that are over-smoothed by previous methods. Quantitatively, PointRend yields significant gains on COCO and Cityscapes, for both instance and semantic segmentation. PointRend's efficiency enables output resolutions that are otherwise impractical in terms of memory or computation compared to existing approaches. Code has been made available at https://github.com/facebookresearch/detectron2/tree/master/projects/PointRend.
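The core idea described above (iteratively upsampling a coarse mask and re-predicting only at adaptively selected, uncertain points) can be illustrated with a minimal sketch. This is not the authors' implementation: `pointrend_like_refine`, `point_fn`, and the "closest to 0.5" uncertainty heuristic are illustrative assumptions standing in for the learned point head and feature sampling in the real module.

```python
import numpy as np

def upsample2x(grid):
    # Nearest-neighbor 2x upsampling of a 2D probability grid.
    return np.repeat(np.repeat(grid, 2, axis=0), 2, axis=1)

def most_uncertain_points(grid, n):
    # Indices of the n points whose foreground probability is
    # closest to 0.5 (a simple proxy for boundary uncertainty).
    order = np.argsort(np.abs(grid.ravel() - 0.5))[:n]
    return np.unravel_index(order, grid.shape)

def pointrend_like_refine(coarse, point_fn, steps=3, n_points=8):
    """Sketch of subdivision-based refinement: repeatedly upsample a
    coarse mask, then re-predict only at the most uncertain points.
    `point_fn(y, x)` is a hypothetical stand-in for the point head,
    taking normalized coordinates and returning probabilities."""
    grid = coarse.astype(float)
    for _ in range(steps):
        grid = upsample2x(grid)
        ys, xs = most_uncertain_points(grid, n_points)
        # Re-evaluate only the selected points at the finer resolution;
        # everything else keeps its cheaply upsampled value.
        grid[ys, xs] = point_fn(ys / grid.shape[0], xs / grid.shape[1])
    return grid
```

The efficiency claim in the abstract follows from this structure: per subdivision step, only `n_points` locations are re-predicted instead of every pixel of the finer grid.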