In semi-supervised medical image segmentation, there is an empirical mismatch between the distributions of labeled and unlabeled data. If labeled and unlabeled data are treated separately or inconsistently, much of the knowledge learned from the labeled data may be discarded. We propose a straightforward method to alleviate this problem: copy-pasting labeled and unlabeled data bidirectionally within a simple Mean Teacher architecture. The method encourages unlabeled data to learn comprehensive common semantics from the labeled data in both the inward and outward directions. More importantly, applying a consistent learning procedure to labeled and unlabeled data can largely reduce the empirical distribution gap. In detail, we copy-paste a random crop from a labeled image (foreground) onto an unlabeled image (background), and a random crop from an unlabeled image (foreground) onto a labeled image (background). The two mixed images are fed into a Student network and supervised by correspondingly mixed signals built from pseudo-labels and ground truth. We show that this simple mechanism of bidirectional copy-pasting between labeled and unlabeled data is sufficient: experiments show solid gains over other state-of-the-art methods on various semi-supervised medical image segmentation datasets (e.g., over 21% Dice improvement on the ACDC dataset with 5% labeled data). Code is available at https://github.com/DeepMed-Lab-ECNU/BCP.
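The bidirectional mixing step described above can be illustrated with a minimal sketch. This is not the released implementation: the function names (`random_crop_mask`, `paste`, `bidirectional_copy_paste`), the use of a single rectangular mask shared by both directions, and the representation of 2D images as nested lists are all assumptions made for illustration. The pseudo-label is assumed to be supplied already, for example by the Teacher network.

```python
import random


def random_crop_mask(height, width, crop_h, crop_w, rng=random):
    """Return a height x width mask that is 1 inside a random crop, 0 elsewhere."""
    top = rng.randrange(height - crop_h + 1)
    left = rng.randrange(width - crop_w + 1)
    return [
        [1 if top <= r < top + crop_h and left <= c < left + crop_w else 0
         for c in range(width)]
        for r in range(height)
    ]


def paste(foreground, background, mask):
    """Take foreground values where mask is 1 and background values elsewhere."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]


def bidirectional_copy_paste(labeled_img, labeled_gt, unlabeled_img, pseudo_label, mask):
    """Build both mixed images and their mixed supervisory signals.

    Assumption: the same mask is reused for both directions.
    """
    # Labeled crop (foreground) pasted onto the unlabeled image (background).
    img_l_on_u = paste(labeled_img, unlabeled_img, mask)
    target_l_on_u = paste(labeled_gt, pseudo_label, mask)
    # Unlabeled crop (foreground) pasted onto the labeled image (background).
    img_u_on_l = paste(unlabeled_img, labeled_img, mask)
    target_u_on_l = paste(pseudo_label, labeled_gt, mask)
    return (img_l_on_u, target_l_on_u), (img_u_on_l, target_u_on_l)


# Hypothetical toy inputs: 4 x 4 "images" and label maps with constant values.
mask = random_crop_mask(4, 4, 2, 2, random.Random(0))
labeled_img = [[1] * 4 for _ in range(4)]
unlabeled_img = [[0] * 4 for _ in range(4)]
labeled_gt = [[2] * 4 for _ in range(4)]
pseudo_label = [[3] * 4 for _ in range(4)]
pairs = bidirectional_copy_paste(labeled_img, labeled_gt, unlabeled_img, pseudo_label, mask)
```

In this sketch each mixed image is paired with a target mixed by the same mask, so every pixel is supervised by ground truth where it came from the labeled image and by the pseudo-label where it came from the unlabeled image. Both pairs would then be passed to the Student network.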