Modern semantic segmentation methods devote much effort to adjusting image feature representations to improve segmentation performance in various ways, such as architecture design and attention mechanisms. However, almost all of these methods neglect the particularity of class weights (in the classification layer) in segmentation models. In this paper, we observe that the class weights of categories that tend to share many adjacent boundary pixels lack discrimination, which limits performance. We call this issue Boundary-caused Class Weights Confusion (BCWC). We focus on this problem and propose a novel method named Embedded Conditional Random Field (E-CRF) to alleviate it. E-CRF fuses the CRF into the CNN as an organic whole for more effective end-to-end optimization. The benefits are twofold. First, it uses the CRF to guide message passing between pixels in high-level features, purifying the feature representation of boundary pixels with the help of inner pixels belonging to the same object. Second, and more importantly, it enables optimizing class weights in both scale and direction during backpropagation. We provide a detailed theoretical analysis to prove this. In addition, superpixels are integrated into E-CRF as an auxiliary to exploit the local object prior for more reliable message passing. Finally, our proposed method yields impressive results on the ADE20K, Cityscapes, and Pascal Context datasets.
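The idea of using a local object prior to make boundary-pixel features more reliable can be illustrated with a minimal sketch. This is not the E-CRF formulation from the paper, which the abstract does not spell out. It assumes a much simpler stand-in: each pixel's high-level feature is blended with the mean feature of the superpixel it belongs to. The function name `superpixel_smooth`, the mixing coefficient `weight`, and the use of a plain mean are all hypothetical choices made for illustration.

```python
import numpy as np


def superpixel_smooth(features, labels, weight=0.5):
    """Blend each pixel feature with the mean feature of its superpixel.

    Illustrative stand-in only, not the paper's E-CRF message passing.

    features: float array of shape (H, W, C), high-level feature map
    labels:   int array of shape (H, W), superpixel id per pixel (0..K-1)
    weight:   hypothetical mixing coefficient in [0, 1]
    """
    h, w, c = features.shape
    flat_feat = features.reshape(-1, c)
    flat_lab = labels.reshape(-1)
    k = int(flat_lab.max()) + 1

    # Accumulate per-superpixel feature sums and pixel counts.
    sums = np.zeros((k, c), dtype=features.dtype)
    np.add.at(sums, flat_lab, flat_feat)
    counts = np.bincount(flat_lab, minlength=k).astype(features.dtype)

    # Mean feature per superpixel; guard against unused ids.
    means = sums / np.maximum(counts, 1)[:, None]

    # Each pixel receives a "message" from its own superpixel.
    mixed = (1.0 - weight) * flat_feat + weight * means[flat_lab]
    return mixed.reshape(h, w, c)


# Toy 2x2 feature map with one channel and two superpixels (one per row).
toy_features = np.array([[[1.0], [3.0]], [[10.0], [20.0]]])
toy_labels = np.array([[0, 0], [1, 1]])
smoothed = superpixel_smooth(toy_features, toy_labels, weight=1.0)
```

With `weight=1.0` every pixel is replaced by its superpixel mean, and with `weight=0.0` the features are returned unchanged. Intermediate values pull noisy boundary pixels toward the features of the region they belong to while keeping some of their own signal.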