Most deep learning algorithms lack explanations for their predictions, which limits their deployment in clinical practice. Approaches to improve explainability, especially in medical imaging, have often been shown to convey limited information, be overly reassuring, or lack robustness. In this work, we introduce the task of generating natural language explanations (NLEs) to justify predictions made on medical images. NLEs are human-friendly and comprehensive, and enable the training of intrinsically explainable models. To this end, we introduce MIMIC-NLE, the first large-scale medical imaging dataset with NLEs. It contains over 38,000 NLEs, which explain the presence of various thoracic pathologies and chest X-ray findings. We propose a general approach to solve the task and evaluate several architectures on this dataset, including via clinician assessment.