Clinical decision support using deep neural networks has become a topic of steadily growing interest. While recent work has repeatedly demonstrated that deep learning offers major advantages over traditional methods for medical image classification, clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered opaque and difficult to comprehend. In recent years, this concern has been addressed by a variety of approaches that provide deeper insight into network decisions. Most notably, additive feature attribution methods propagate decisions back into the input space by creating a saliency map, which allows the practitioner to "see what the network sees." However, the quality of the generated maps can become poor and the images noisy if only limited data is available, a typical scenario in clinical contexts. We propose a novel decision explanation scheme based on CycleGAN activation maximization, which generates high-quality visualizations of classifier decisions even on smaller datasets. We conducted a user study in which we evaluated our method on the LIDC dataset for lung lesion malignancy classification, the BreastMNIST dataset for breast cancer detection in ultrasound images, and two subsets of the CIFAR-10 dataset for object recognition in RGB images. Within this user study, our method clearly outperformed existing approaches on the medical imaging datasets and ranked second in the natural image setting. With our approach, we make a significant contribution towards a better understanding of clinical decision support systems based on deep neural networks and thus aim to foster overall clinical acceptance.