Machine learning is expected to fuel significant improvements in medical care. To ensure that fundamental principles such as beneficence, respect for human autonomy, prevention of harm, justice, privacy, and transparency are respected, medical machine learning systems must be developed responsibly. Many high-level declarations of ethical principles have been put forth for this purpose, but there is a severe lack of technical guidelines explicating their practical consequences for medical machine learning. Similarly, there is currently considerable uncertainty regarding the exact regulatory requirements placed upon medical machine learning systems. This survey provides an overview of the technical and procedural challenges involved in creating medical machine learning systems responsibly and in conformity with existing regulations, as well as possible solutions to these challenges. First, a brief review of existing regulations affecting medical machine learning is provided, showing that properties such as safety, robustness, reliability, privacy, security, transparency, explainability, and nondiscrimination are all already demanded by existing law and regulations, albeit, in many cases, to an uncertain degree. Next, the key technical obstacles to achieving these desirable properties are discussed, as well as important techniques for overcoming these obstacles in the medical context. We note that distribution shift, spurious correlations, model underspecification, uncertainty quantification, and data scarcity represent severe challenges in the medical context. Promising solution approaches include the use of large and representative datasets, with federated learning as a means to that end, the careful exploitation of domain knowledge, the use of inherently transparent models, comprehensive out-of-distribution model testing and verification, and algorithmic impact assessments.