Despite achieving promising results in a breadth of medical image segmentation tasks, deep neural networks require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process, which limits their applicability in scenarios where annotated images are scarce. Mixed supervision is an appealing alternative for mitigating this obstacle: only a small fraction of the data carries complete pixel-wise annotations, while the remaining images have a weaker form of supervision. In this work, we propose a dual-branch architecture in which the upper branch (teacher) receives strong annotations, while the bottom branch (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a Kullback-Leibler (KL) divergence term, which transfers the knowledge of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Quantitative and qualitative results on two publicly available datasets demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches. Moreover, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter.
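As a minimal illustrative sketch of the objective summarized above, and using notation assumed here rather than taken from the paper ($f_t$ and $f_s$ denote the teacher and student predictions, $x_\ell$ the strongly annotated images with pixel-wise labels $y_\ell$, $x_w$ the less-supervised images, and $\lambda_{\mathrm{ent}}$, $\lambda_{\mathrm{KL}}$ balancing weights), the combined loss could take a form such as
\[
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{CE}}\!\big(f_t(x_\ell), y_\ell\big)
\;+\; \lambda_{\mathrm{ent}}\, \mathcal{H}\!\big(f_s(x_w)\big)
\;+\; \lambda_{\mathrm{KL}}\, \mathrm{KL}\!\big(f_t(x_w)\,\big\|\, f_s(x_w)\big),
\]
where the cross-entropy term is evaluated over the labeled pixels (possibly on both branches), the Shannon entropy $\mathcal{H}$ encourages confident student predictions on the less-supervised images, and the KL term distills the strongly supervised branch into the student branch, keeping the entropy term away from trivial solutions. The exact composition and weighting of these terms are assumptions for illustration only.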