Learning harmful shortcuts, such as spurious correlations and biases, prevents deep neural networks from learning meaningful and useful representations, thus jeopardizing the generalizability and interpretability of the learned representations. The situation is even more serious in medical imaging, where clinical data (e.g., MR images with pathology) are scarce while the reliability, generalizability, and transparency of the learned model are essential. To address this problem, we propose to infuse human experts' intelligence and domain knowledge into the training of deep neural networks. The core idea is to use visual attention information from expert radiologists to proactively guide the deep model to focus on regions with potential pathology and to avoid being trapped in learning harmful shortcuts. To do so, we propose a novel eye-gaze-guided vision transformer (EG-ViT) for diagnosis with limited medical image data. We mask the input image patches that fall outside the radiologists' regions of interest and add a residual connection in the last encoder layer of EG-ViT to maintain the correlations among all patches. Experiments on two public datasets, INbreast and SIIM-ACR, demonstrate that EG-ViT can effectively learn and transfer experts' domain knowledge and achieve much better performance than the baselines. Meanwhile, it successfully rectifies harmful shortcut learning and significantly improves the model's interpretability. In general, EG-ViT takes advantage of both human experts' prior knowledge and the power of deep neural networks. This work opens new avenues for advancing current artificial intelligence paradigms by infusing human intelligence.
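
As an illustration of the patch-masking idea, the following is a minimal NumPy sketch, not the EG-ViT implementation. It assumes the radiologists' gaze is available as a 2D heatmap aligned with the image, that a patch is kept when its mean gaze intensity exceeds a threshold, and that masked patches are filled with zeros; the function names `gaze_patch_mask` and `mask_image_patches`, the threshold rule, and the zero-filling are all hypothetical choices made for this example. The residual connection in the last encoder layer is not shown.

```python
import numpy as np


def gaze_patch_mask(heatmap, patch_size, threshold):
    """Return a boolean (rows, cols) grid marking patches to keep.

    A patch is kept when its mean gaze intensity exceeds `threshold`.
    This rule is an assumption for illustration only.
    """
    h, w = heatmap.shape
    if h % patch_size or w % patch_size:
        raise ValueError("heatmap size must be a multiple of patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Split the heatmap into a (rows, patch, cols, patch) grid of patches.
    patches = heatmap.reshape(rows, patch_size, cols, patch_size)
    # Average over the two within-patch axes.
    patch_mean = patches.mean(axis=(1, 3))
    return patch_mean > threshold


def mask_image_patches(image, keep, patch_size):
    """Zero out every patch of a 2D image whose `keep` entry is False."""
    # Expand the patch-level mask back to pixel resolution.
    pixel_mask = np.repeat(np.repeat(keep, patch_size, axis=0), patch_size, axis=1)
    return image * pixel_mask


# Toy example: a 4x4 image with gaze concentrated on the top-left 2x2 patch.
heatmap = np.zeros((4, 4))
heatmap[:2, :2] = 1.0
image = np.ones((4, 4))

keep = gaze_patch_mask(heatmap, patch_size=2, threshold=0.5)
masked = mask_image_patches(image, keep, patch_size=2)
```

In this toy case only the top-left patch survives, so a downstream model would see image content only where the expert looked. A real pipeline could instead drop the masked patch tokens entirely; which variant EG-ViT uses is not stated in the abstract.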