The use of machine learning to develop intelligent software tools for interpretation of radiology images has gained widespread attention in recent years. The development, deployment, and eventual adoption of these models in clinical practice, however, remains fraught with challenges. In this paper, we propose a list of key considerations that machine learning researchers must recognize and address to make their models accurate, robust, and usable in practice. Namely, we discuss: insufficient training data, decentralized datasets, high cost of annotations, ambiguous ground truth, imbalance in class representation, asymmetric misclassification costs, relevant performance metrics, generalization of models to unseen datasets, model decay, adversarial attacks, explainability, fairness and bias, and clinical validation. We describe each consideration and identify techniques to address it. Although these techniques have been discussed in prior research literature, by freshly examining them in the context of medical imaging and compiling them in the form of a laundry list, we hope to make them more accessible to researchers, software developers, radiologists, and other stakeholders.