Deep-learning-based pipelines for semantic segmentation often ignore the structural information available in the annotated images used for training. We propose a novel post-processing module that enforces structural knowledge about the objects of interest to improve the segmentation results provided by deep learning. This module corresponds to a "many-to-one-or-none" inexact graph matching approach and is formulated as a quadratic assignment problem. Our approach is compared to CNN-based segmentation (for various CNN backbones) on two public datasets, one for face segmentation from 2D RGB images (FASSEG) and the other for brain segmentation from 3D MRIs (IBSR). Evaluations are performed using two types of structural information: distances and directional relations, this choice being a hyper-parameter of our generic framework. On FASSEG data, results show that our module improves the accuracy of the CNN by about 6.3% (the Hausdorff distance decreases from 22.11 to 20.71). On IBSR data, the improvement is 51% (the Hausdorff distance decreases from 11.01 to 5.4). In addition, our approach is shown to be resilient to small training datasets, which often limit the performance of deep learning methods: the improvement increases as the size of the training dataset decreases.
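To make the "many-to-one-or-none" matching idea concrete, the following is a minimal toy sketch and not the formulation or solver used in this work. It assumes a cost of the form x^T K x over a flattened binary assignment x, where the diagonal of K holds a unary dissimilarity (for example, one minus a CNN class probability) and the off-diagonal entries compare pairwise distances between image regions with pairwise distances between model classes. The names `build_K`, `qap_cost`, `best_assignment`, the weight `alpha`, the penalty `none_cost`, and the 1D toy data are all hypothetical, and the exhaustive search is only practical for a handful of regions.

```python
import itertools
import numpy as np


def build_K(unary, d_img, d_model, alpha=0.5):
    """Build a toy QAP cost matrix (assumed form, not the paper's).

    unary[i, k]   : dissimilarity between region i and class k
    d_img[i, j]   : distance between regions i and j
    d_model[k, l] : distance between classes k and l in the model graph
    """
    n, m = unary.shape
    K = np.zeros((n * m, n * m))
    for i in range(n):
        for k in range(m):
            a = i * m + k
            K[a, a] = alpha * unary[i, k]
            for j in range(n):
                if j == i:
                    continue
                for l in range(m):
                    b = j * m + l
                    # Pairwise term: mismatch between image and model distances.
                    K[a, b] = (1 - alpha) * abs(d_img[i, j] - d_model[k, l])
    return K


def qap_cost(x, K):
    """Quadratic cost x^T K x (each unordered pair is counted twice)."""
    return float(x @ K @ x)


def best_assignment(unary, d_img, d_model, alpha=0.5, none_cost=1.0):
    """Exhaustively search labels in {-1, 0, ..., m-1} for every region.

    Label -1 means "matched to none"; several regions may share a class.
    """
    n, m = unary.shape
    K = build_K(unary, d_img, d_model, alpha)
    best = None
    for labels in itertools.product(range(-1, m), repeat=n):
        x = np.zeros(n * m)
        for i, k in enumerate(labels):
            if k >= 0:
                x[i * m + k] = 1.0
        cost = qap_cost(x, K) + none_cost * labels.count(-1)
        if best is None or cost < best[0]:
            best = (cost, labels)
    return best


# Toy 1D data: two model classes 10 units apart, three image regions.
d_model = np.array([[0.0, 10.0], [10.0, 0.0]])
pos = np.array([0.0, 1.0, 10.0])
d_img = np.abs(pos[:, None] - pos[None, :])
# Region 1 is (wrongly) closer to class 1 according to the unary term alone.
unary = np.array([[0.1, 0.9], [0.6, 0.4], [0.9, 0.1]])

if __name__ == "__main__":
    print(best_assignment(unary, d_img, d_model, alpha=0.5, none_cost=5.0))
    print(best_assignment(unary, d_img, d_model, alpha=0.5, none_cost=0.5))
```

In this toy setting, a high `none_cost` forces every region to be labelled, and the pairwise term moves region 1 to class 0 despite its unary preference for class 1. A low `none_cost` instead lets the ambiguous region be matched to none.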