This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning. Although supervised deep learning has achieved great success across various medical image analysis tasks, that success relies on large amounts of high-quality annotations, which can be challenging to acquire in practice. To overcome this limitation, we propose a semi-supervised method, virtual adversarial training (VAT), that learns useful information from unlabeled data to improve the classification of breast masses. Accordingly, our VAT-based models have two types of losses: a supervised loss and a virtual adversarial loss. The former is the usual supervised classification loss computed on labeled images, while the latter penalizes changes in the model's predictions under virtual adversarial perturbations of the input, which requires no labels and thus improves robustness and generalizability. To evaluate the performance of our VAT-based CAD scheme, we retrospectively assembled a total of 1024 breast mass images, with equal numbers of benign and malignant masses. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, the VAT-based CNNs delivered the highest classification accuracies of 0.740 and 0.760, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images.
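The two-loss objective described above can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions made only for illustration: a logistic model stands in for the CNNs, so the gradient of the KL divergence can be written analytically instead of by backpropagation; a single power-iteration step is used; the virtual adversarial term is averaged over both labeled and unlabeled samples; and the names `eps`, `xi`, and `alpha`, along with every function name, are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of the positive class under a logistic model (CNN stand-in)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bernoulli_kl(p, q, tiny=1e-12):
    """KL(Bernoulli(p) || Bernoulli(q)), clamped away from log(0)."""
    p = min(max(p, tiny), 1.0 - tiny)
    q = min(max(q, tiny), 1.0 - tiny)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def unit(v, tiny=1e-12):
    """Scale v to (approximately) unit Euclidean norm."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / (norm + tiny) for vi in v]


def virtual_adversarial_perturbation(w, b, x, eps=1.0, xi=1e-3, rng=random):
    """One power-iteration step: probe a random direction, follow the KL gradient."""
    d = unit([rng.gauss(0.0, 1.0) for _ in x])
    p = predict(w, b, x)  # current prediction, treated as a fixed target
    q = predict(w, b, [xj + xi * dj for xj, dj in zip(x, d)])
    # For this logistic model, the gradient of KL(p || q(r)) w.r.t. r is (q - p) * w.
    grad = [(q - p) * wi for wi in w]
    return [eps * gi for gi in unit(grad)]


def vat_loss(w, b, x, eps=1.0, rng=random):
    """Virtual adversarial loss for one input; no label is needed."""
    r_adv = virtual_adversarial_perturbation(w, b, x, eps=eps, rng=rng)
    p = predict(w, b, x)
    q = predict(w, b, [xj + rj for xj, rj in zip(x, r_adv)])
    return bernoulli_kl(p, q)


def supervised_loss(w, b, x, y):
    """Binary cross-entropy for one labeled input (y is 0 or 1)."""
    p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def total_loss(w, b, labeled, unlabeled, alpha=1.0, eps=1.0, rng=random):
    """Supervised loss on labeled data plus alpha times the mean VAT loss."""
    sup = sum(supervised_loss(w, b, x, y) for x, y in labeled) / len(labeled)
    all_x = [x for x, _ in labeled] + list(unlabeled)
    vadv = sum(vat_loss(w, b, x, eps, rng) for x in all_x) / len(all_x)
    return sup + alpha * vadv
```

Setting `alpha=0.0` recovers purely supervised training, which corresponds to the "without the adversarial loss" baseline. In a CNN setting the analytic gradient line would be replaced by automatic differentiation with respect to the perturbation.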