Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging

Artificial intelligence (AI) models are increasingly used in the medical domain. However, because medical data is highly sensitive, special precautions are required to protect it. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. Prior work indicates that DP degrades model accuracy and fairness, which is unacceptable in medicine and is a main barrier to the widespread use of privacy-preserving techniques. In this work, we evaluated how privacy-preserving training of AI models for chest radiograph diagnosis affects accuracy and fairness compared to non-private training. For this, we used a large dataset (N=193,311) of high-quality clinical chest radiographs, which were retrospectively collected and manually labeled by experienced radiologists. We then compared non-private deep convolutional neural networks (CNNs) and privacy-preserving (DP) models with respect to privacy-utility trade-offs, measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson's r or Statistical Parity Difference. We found that the non-private CNNs achieved an average AUROC of 0.90 ± 0.04 over all labels, whereas the DP CNNs with a privacy budget of ε = 7.89 achieved an AUROC of 0.87 ± 0.04, i.e., a mere 2.6% performance decrease compared to non-private training. Furthermore, we found that privacy-preserving training did not amplify discrimination against age, sex or co-morbidity. Our study shows that, under the challenging realistic circumstances of a real-life clinical dataset, privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
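The abstract names Statistical Parity Difference as one of its fairness measures but does not spell out how it is computed. As a rough illustration only, the sketch below uses one common definition: the rate of positive predictions inside a subgroup minus the rate outside it. The function name `statistical_parity_difference`, its arguments, and the toy data are assumptions made for this example and are not taken from the paper, which may compute the metric differently.

```python
from typing import Sequence


def statistical_parity_difference(
    predictions: Sequence[int], in_group: Sequence[bool]
) -> float:
    """Positive-prediction rate inside the subgroup minus the rate outside it.

    `predictions` holds binary model outputs (0 or 1); `in_group` flags the
    samples that belong to the subgroup of interest. A value of 0.0 means
    both groups receive positive predictions at the same rate.
    """
    if len(predictions) != len(in_group):
        raise ValueError("predictions and in_group must have the same length")

    inside = [p for p, g in zip(predictions, in_group) if g]
    outside = [p for p, g in zip(predictions, in_group) if not g]
    if not inside or not outside:
        raise ValueError("both groups must contain at least one sample")

    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Hypothetical toy data: four samples in the subgroup, four outside it.
toy_predictions = [1, 1, 0, 1, 0, 1, 0, 0]
toy_in_group = [True, True, True, True, False, False, False, False]
spd = statistical_parity_difference(toy_predictions, toy_in_group)
print(f"Statistical Parity Difference: {spd:+.2f}")
```

Under this definition, comparing the value for a non-private model with the value for its DP counterpart on the same subgroup is one way to check whether private training widens a gap between groups.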

