Methods to detect malignant lesions from screening mammograms are usually trained with fully annotated datasets, where images are labelled with the localisation and classification of cancerous lesions. However, real-world screening mammogram datasets commonly have a subset that is fully annotated and another subset that is weakly annotated with just the global classification (i.e., without lesion localisation). Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: either discard it or fully annotate it. The first option reduces detection accuracy because it does not use the whole dataset, and the second is too expensive because the annotation must be done by expert radiologists. In this paper, we propose a middle-ground solution to this dilemma: we formulate the training as a weakly- and semi-supervised learning problem that we refer to as malignant breast lesion detection with incomplete annotations. To address this problem, our new method comprises two stages: 1) pre-training a multi-view mammogram classifier with weak supervision from the whole dataset, and 2) extending the trained classifier into a multi-view detector that is trained with semi-supervised student-teacher learning, where the training set contains fully and weakly annotated mammograms. We provide extensive detection results on two real-world screening mammogram datasets containing incomplete annotations, and show that our proposed approach achieves state-of-the-art results in the detection of malignant breast lesions with incomplete annotations.
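To make the incomplete-annotation setting concrete, the following is a minimal Python sketch of how a dataset with fully and weakly annotated mammograms might be split and routed during student-teacher training. It illustrates a generic semi-supervised pattern and is not the method of this paper. The abstract does not specify the update rule or the pseudo-labelling strategy, so the exponential-moving-average teacher, the score threshold, and all names (`Mammogram`, `split_by_annotation`, `ema_update`, `select_targets`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical box format: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


@dataclass
class Mammogram:
    image_id: str
    malignant: bool                    # global (weak) label, present for every image
    boxes: Optional[List[Box]] = None  # lesion localisation, fully annotated subset only


def split_by_annotation(dataset: List[Mammogram]):
    """Separate the fully annotated subset from the weakly annotated one."""
    fully = [m for m in dataset if m.boxes is not None]
    weakly = [m for m in dataset if m.boxes is None]
    return fully, weakly


def ema_update(teacher: List[float], student: List[float],
               momentum: float = 0.99) -> List[float]:
    """Assumed teacher update: exponential moving average of student weights."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def select_targets(sample: Mammogram,
                   teacher_detections: List[Tuple[Box, float]],
                   score_threshold: float = 0.5) -> List[Box]:
    """Choose the detection targets used to train the student on one sample.

    Fully annotated images use their ground-truth boxes. Weakly annotated
    images fall back to confident teacher detections, gated by the global
    label (an assumption: images labelled non-malignant get no lesion boxes).
    """
    if sample.boxes is not None:
        return sample.boxes
    if not sample.malignant:
        return []
    return [box for box, score in teacher_detections if score >= score_threshold]
```

In this sketch the weak label still contributes to training: it decides whether the teacher's detections are trusted as pseudo-labels at all, so the weakly annotated subset is used without additional radiologist annotation.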