High-quality instance segmentation is of growing importance in computer vision. DCT-Mask generates high-resolution masks directly from compressed vectors, without any refinement. To further refine masks obtained from compressed vectors, we propose, for the first time, a multi-stage refinement framework based on compressed vectors. However, a vanilla combination of the two does not bring significant gains, because changes in some elements of the DCT vector affect the prediction of the entire mask. We therefore propose a simple and novel method named PatchDCT, which separates the mask decoded from a DCT vector into several patches and refines each patch with a purpose-designed classifier and regressor. Specifically, the classifier distinguishes mixed patches from all other patches and corrects previously mispredicted foreground and background patches. The regressor, in contrast, predicts DCT vectors for the mixed patches, further refining segmentation quality at boundary locations. Experiments show that our method achieves 2.0%, 3.2%, 4.5% AP and 3.4%, 5.3%, 7.0% Boundary AP improvements over Mask-RCNN on COCO, LVIS, and Cityscapes, respectively. It also surpasses DCT-Mask by 0.7%, 1.1%, 1.3% AP and 0.9%, 1.7%, 4.2% Boundary AP on COCO, LVIS, and Cityscapes, respectively. PatchDCT is also competitive with other state-of-the-art methods.
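The motivation for patching can be illustrated with a small NumPy sketch. It is not the PatchDCT implementation: the classifier and regressor in the method are learned, whereas here the three-way patch labels are read directly from a toy ground-truth mask, and every helper name (`dct2`, `split_patches`, `label_patches`, and so on), the 32x32 mask, and the 8x8 patch size are assumptions chosen for illustration. The sketch shows that perturbing one coefficient of a whole-mask DCT moves every decoded pixel, while perturbing one coefficient of a single patch's DCT stays inside that patch.

```python
import numpy as np

# Illustrative labels only; the names are assumptions, not taken from the paper.
BACKGROUND, FOREGROUND, MIXED = 0, 1, 2

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix D with D @ D.T == I."""
    k = np.arange(n)[:, None]  # frequency index
    x = np.arange(n)[None, :]  # spatial index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def dct2(block):
    """2D DCT of a square block."""
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def idct2(coeffs):
    """Inverse of dct2."""
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d

def split_patches(mask, p):
    """(H, W) mask -> (H/p, W/p, p, p) grid of p x p patches."""
    h, w = mask.shape
    return mask.reshape(h // p, p, w // p, p).swapaxes(1, 2)

def merge_patches(patches):
    """Inverse of split_patches."""
    gh, gw, p, _ = patches.shape
    return patches.swapaxes(1, 2).reshape(gh * p, gw * p)

def label_patches(patches):
    """Label each patch from a binary ground-truth mask (stand-in for a learned classifier)."""
    lo = patches.min(axis=(2, 3))
    hi = patches.max(axis=(2, 3))
    labels = np.full(lo.shape, MIXED)
    labels[hi == 0] = BACKGROUND  # no foreground pixel in the patch
    labels[lo == 1] = FOREGROUND  # every pixel in the patch is foreground
    return labels

def changed_pixels_global(mask, delta=1.0):
    """Perturb one coefficient of the whole-mask DCT; count decoded pixels that move."""
    coeffs = dct2(mask)
    perturbed = coeffs.copy()
    perturbed[1, 1] += delta
    return int(np.sum(np.abs(idct2(perturbed) - idct2(coeffs)) > 1e-6))

def changed_pixels_patch(mask, p, delta=1.0):
    """Perturb one coefficient of a single patch's DCT; count decoded pixels that move."""
    patches = split_patches(mask, p).copy()
    coeffs = dct2(patches[1, 1])
    coeffs[1, 1] += delta
    patches[1, 1] = idct2(coeffs)
    return int(np.sum(np.abs(merge_patches(patches) - mask) > 1e-6))

# Toy 32x32 binary mask containing one rectangle.
mask = np.zeros((32, 32))
mask[10:30, 2:22] = 1.0

labels = label_patches(split_patches(mask, 8))
print("pixels moved, whole-mask DCT:", changed_pixels_global(mask))
print("pixels moved, per-patch DCT: ", changed_pixels_patch(mask, 8))
print("mixed patches:", int((labels == MIXED).sum()), "of", labels.size)
```

In this toy setting only the patches that straddle the rectangle's boundary are labeled mixed, which is where a per-patch DCT vector would be needed; pure foreground and background patches can be filled in directly from their label.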