Instance segmentation is one of the fundamental vision tasks. Recently, fully convolutional instance segmentation methods have drawn much attention because they are often simpler and more efficient than two-stage approaches such as Mask R-CNN. To date, however, almost all such approaches fall behind the two-stage Mask R-CNN in mask precision at similar computational complexity, leaving considerable room for improvement. In this work, we improve mask prediction by effectively combining instance-level information with lower-level, fine-grained semantic information. Our main contribution is a blender module that draws inspiration from both top-down and bottom-up instance segmentation approaches. The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels and learn attention maps for each instance with merely one convolution layer, which makes inference fast. BlendMask can be easily incorporated into state-of-the-art one-stage detection frameworks and outperforms Mask R-CNN under the same training schedule while being 20% faster. A lightweight version of BlendMask achieves 34.2% mAP at 25 FPS, evaluated on a single 1080Ti GPU. Because of its simplicity and efficacy, we hope that BlendMask can serve as a simple yet strong baseline for a wide range of instance-wise prediction tasks.
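The idea of blending a few position-sensitive feature channels with per-instance attention maps can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes that the features have already been cropped to one instance at resolution R x R, that the attention maps have the same resolution, and that the attention is normalized with a softmax across the K channels. The function name `blend` and the sizes `K` and `R` are hypothetical.

```python
import numpy as np


def blend(bases, attention):
    """Blend K cropped feature channels with K per-instance attention maps.

    bases:     (K, R, R) position-sensitive features cropped to one instance
    attention: (K, R, R) attention logits for the same instance
    Returns an (R, R) array of mask logits.

    Assumption: softmax over the K axis; the abstract does not specify this.
    """
    # Softmax over the K axis so the weights at each pixel sum to 1.
    shifted = attention - attention.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum of the feature channels, pixel by pixel.
    return (weights * bases).sum(axis=0)


rng = np.random.default_rng(0)
K, R = 4, 14  # hypothetical sizes: few channels, small instance resolution
bases = rng.normal(size=(K, R, R))
attention = rng.normal(size=(K, R, R))
mask_logits = blend(bases, attention)
print(mask_logits.shape)  # (14, 14)
```

With uniform attention the result reduces to the plain average of the K channels, so any instance-specific shape has to come from the learned attention maps.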