We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron
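As a rough illustration of the parallel-branch design described above, the following toy sketch shows how the outputs of a box/class branch and a mask branch might be combined for a single region of interest. It is not the Detectron implementation. The function name `combine_roi_outputs`, its parameters, and the tiny 2x2 masks are hypothetical, and the per-class mask selection with a 0.5 threshold is an assumption about inference details that the abstract itself does not state.

```python
import math
from typing import Dict, List

Mask = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits: List[float]) -> List[float]:
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_roi_outputs(
    class_logits: List[float],
    box_per_class: List[List[float]],
    mask_logits_per_class: List[Mask],
    threshold: float = 0.5,  # assumed binarization threshold
) -> Dict[str, object]:
    """Merge the outputs of two parallel branches for one RoI (toy sketch).

    The box branch supplies class logits and one box per class; the mask
    branch supplies one mask of logits per class. The mask of the
    predicted class is selected and binarized.
    """
    probs = softmax(class_logits)
    k = max(range(len(probs)), key=probs.__getitem__)  # predicted class
    mask = [
        [1 if sigmoid(v) > threshold else 0 for v in row]
        for row in mask_logits_per_class[k]
    ]
    return {"label": k, "score": probs[k], "box": box_per_class[k], "mask": mask}


# Hypothetical branch outputs for a single RoI with two classes.
result = combine_roi_outputs(
    class_logits=[0.1, 2.0],
    box_per_class=[[0, 0, 4, 4], [1, 1, 3, 3]],
    mask_logits_per_class=[
        [[-3.0, -3.0], [-3.0, -3.0]],
        [[4.0, -2.0], [3.0, 1.0]],
    ],
)
print(result["label"], result["box"], result["mask"])
```

Because the mask branch here never influences the class or box prediction, the two branches can be computed independently, which is the sense in which the sketch treats them as parallel.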