Modern semantic segmentation methods devote much attention to adjusting feature representations to improve segmentation performance, for example through metric learning or architecture design. However, almost all of these methods neglect the particular nature of boundary pixels. Because receptive fields keep expanding through the layers of a CNN, boundary pixels tend to pick up confusing features from both sides of the boundary. As a result, they mislead the direction of model optimization and make the class weights of categories that frequently share adjacent pixels less discriminative, which damages overall performance. In this work, we examine this problem in depth and propose a novel method named Embedded Superpixel CRF (ES-CRF) to address it. ES-CRF involves two main aspects. First, ES-CRF fuses the CRF mechanism into the CNN as an organic whole for more effective end-to-end optimization. It uses the CRF to guide message passing between pixels in high-level features, purifying the feature representation of boundary pixels with the help of inner pixels belonging to the same object. Second, superpixels are integrated into ES-CRF to exploit the local object prior for more reliable message passing. Our proposed method achieves new state-of-the-art results on two challenging benchmarks, Cityscapes and ADE20K. Moreover, we provide a detailed theoretical analysis to verify the superiority of ES-CRF.
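To make the idea of superpixel-guided message passing more concrete, the following is a minimal NumPy sketch, not the ES-CRF formulation itself. It assumes a high-level feature map and a precomputed superpixel label map, and it replaces the CRF pairwise terms with a simple stand-in: each pixel's feature is blended with the mean feature of its superpixel. The function name `superpixel_message_passing` and the blending weight `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def superpixel_message_passing(features, superpixels, alpha=0.5):
    """Blend each pixel's feature with the mean feature of its superpixel.

    features:    float array of shape (H, W, C), high-level feature map.
    superpixels: int array of shape (H, W), superpixel id per pixel (0-based).
    alpha:       hypothetical blending weight in [0, 1]; 0 keeps the input,
                 1 replaces every feature by its superpixel mean.
    """
    h, w, c = features.shape
    flat_feats = features.reshape(-1, c).astype(np.float64)
    flat_ids = superpixels.reshape(-1)
    num_regions = int(flat_ids.max()) + 1

    # Accumulate per-superpixel feature sums and pixel counts.
    sums = np.zeros((num_regions, c))
    np.add.at(sums, flat_ids, flat_feats)
    counts = np.bincount(flat_ids, minlength=num_regions).astype(np.float64)
    counts = np.maximum(counts, 1.0)  # guard against unused ids
    means = sums / counts[:, None]

    # Each pixel receives a "message" from the other pixels in its superpixel.
    refined = (1.0 - alpha) * flat_feats + alpha * means[flat_ids]
    return refined.reshape(h, w, c)


# Toy example: a 1x4 feature map with one channel. The third pixel sits near
# a boundary and carries a mixed feature (0.4) although it belongs to region 0.
features = np.array([[[1.0], [1.0], [0.4], [0.0]]])
superpixels = np.array([[0, 0, 0, 1]])
refined = superpixel_message_passing(features, superpixels, alpha=1.0)
```

In this toy case the mixed boundary feature is pulled toward the other pixels of its own superpixel, while the pixel in the neighboring superpixel is left unchanged. A real system would presumably learn the interaction between pixels rather than use a fixed average.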