In the last few years, deep learning classifiers have shown promising results in image-based medical diagnosis. However, interpreting the outputs of these models remains a challenge. In cancer diagnosis, interpretability can be achieved by localizing the region of the input image responsible for the output, i.e., the location of a lesion. Alternatively, segmentation or detection models can be trained with pixel-wise annotations indicating the locations of malignant lesions. Unfortunately, acquiring such labels is labor-intensive and requires medical expertise. To overcome this difficulty, weakly-supervised localization can be utilized. These methods allow neural network classifiers to output saliency maps highlighting the regions of the input most relevant to the classification task (e.g., malignant lesions in mammograms) using only image-level labels (e.g., whether the patient has cancer or not) during training. When applied to high-resolution images, existing methods produce low-resolution saliency maps. This is problematic in applications in which suspicious lesions are small relative to the image size. In this work, we introduce a novel neural network architecture to perform weakly-supervised segmentation of high-resolution images. The proposed model selects regions of interest via coarse-level localization and then performs fine-grained segmentation of those regions. We apply this model to breast cancer diagnosis with screening mammography and validate it on a large, clinically realistic dataset. Measured by Dice similarity score, our approach outperforms existing methods by a large margin in localizing benign and malignant lesions, with relative improvements of 39.6% and 20.0%, respectively. Code and the weights of some of the models are available at https://github.com/nyukat/GLAM
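To make the evaluation metric concrete, the following is a minimal sketch of how a Dice similarity score between a thresholded saliency map and a pixel-wise annotation could be computed, and how a relative improvement is derived from two scores. The function names, the 0.5 threshold, and the toy arrays are illustrative assumptions and are not taken from the paper or its repository; the actual evaluation protocol may differ.

```python
import numpy as np


def dice_score(pred_mask, true_mask):
    """Dice similarity between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def relative_improvement(new_score, baseline_score):
    """Relative gain of new_score over baseline_score, as a fraction."""
    return (new_score - baseline_score) / baseline_score


# Toy saliency map (hypothetical values) and a hypothetical lesion annotation.
saliency = np.array([[0.1, 0.8, 0.9],
                     [0.2, 0.7, 0.1],
                     [0.0, 0.1, 0.0]])
truth = np.array([[0, 1, 1],
                  [0, 0, 0],
                  [0, 0, 0]])

# Assumed threshold of 0.5 to binarize the saliency map.
pred = saliency >= 0.5
score = dice_score(pred, truth)
```

In this toy case the prediction covers three pixels, the annotation covers two, and they share two, so the score is 2·2 / (3 + 2) = 0.8. A relative improvement such as those quoted above is the difference between two Dice scores divided by the baseline score.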