A common problem in neural-network segmentation of medical images is the difficulty of obtaining a sufficient amount of pixel-level annotated data for training. To address this issue, we propose a semi-supervised segmentation network based on contrastive learning. Unlike previous state-of-the-art methods, we introduce Min-Max Similarity (MMS), a contrastive-learning form of dual-view training that employs classifiers and projectors to build all-negative feature pairs and positive and negative feature pairs, respectively, so that learning is formulated as solving an MMS problem. The all-negative pairs supervise the networks as they learn from different views and capture general features, while the consistency of predictions on unlabeled data is measured by a pixel-wise contrastive loss between positive and negative pairs. To evaluate the proposed method quantitatively and qualitatively, we test it on four public endoscopy surgical tool segmentation datasets and one cochlear implant surgery dataset, which we manually annotated. Results indicate that the proposed method consistently outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms. Our semi-supervised segmentation algorithm can also successfully recognize unknown surgical tools and provide good predictions. In addition, the MMS approach achieves inference speeds of about 40 frames per second (fps), making it suitable for real-time video segmentation.
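To make the idea of a pixel-wise contrastive loss between positive and negative pairs concrete, the following is a minimal sketch of a generic InfoNCE-style loss over per-pixel features from two views. It is an illustration under stated assumptions, not the MMS formulation itself, which the abstract does not specify: the function names `cosine` and `pixel_contrastive_loss`, the `temperature` parameter, the use of cosine similarity, and the choice of "same pixel in the other view" as the positive pair are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def pixel_contrastive_loss(view_a, view_b, temperature=0.1):
    """Generic InfoNCE-style loss over per-pixel features (assumed form).

    view_a[i] and view_b[i] are features of the same pixel seen from two
    views and form the positive pair; view_a[i] paired with view_b[j],
    j != i, is treated as a negative pair.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidate pixels.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the positive pair.
        total += log_denom - logits[i]
    return total / len(view_a)


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("consistent views:", pixel_contrastive_loss(features, features))
    print("inconsistent views:", pixel_contrastive_loss(features, swapped))
```

In this sketch the loss is small when the two views agree pixel by pixel and large when they disagree, which is the sense in which such a loss can measure the consistency of predictions on unlabeled data.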