Machine learning (ML) models used in medical imaging diagnostics can be vulnerable to a variety of privacy attacks, including membership inference attacks, which can lead to violations of regulations governing the use of medical data and threaten to compromise the models' effective deployment in the clinic. In contrast to most recent work in privacy-aware ML, which has focused on model alteration and post-processing steps, we propose a novel and complementary scheme that enhances the security of medical data by controlling the data sharing process. We develop and evaluate a privacy defense protocol based on a generative adversarial network (GAN) that allows a medical data sourcer (e.g., a hospital) to provide an external agent (a modeler) with a proxy dataset synthesized from the original images, so that the resulting diagnostic systems made available to model consumers are rendered resilient to privacy attackers. We validate the proposed method on a retinal diagnostics AI for diabetic retinopathy, which carries the risk of leaking private information. To incorporate the concerns of both privacy advocates and modelers, we introduce a metric that evaluates privacy and utility performance in combination, and demonstrate, using both this new metric and classical ones, that our approach, by itself or in conjunction with other defenses, provides state-of-the-art (SOTA) performance in defending against privacy attacks.
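For readers unfamiliar with membership inference, the following minimal sketch shows one common baseline form of the attack: the attacker guesses that an image was in the training set whenever the model's confidence on it exceeds a threshold. This is an illustration only, not the attack or the combined privacy-utility metric evaluated in this work; the function names, the threshold, and the confidence values are all hypothetical.

```python
from typing import List, Sequence


def threshold_attack(confidences: Sequence[float], threshold: float) -> List[bool]:
    """Guess 'member' when the model's confidence exceeds the threshold."""
    return [c > threshold for c in confidences]


def attack_advantage(
    member_conf: Sequence[float],
    nonmember_conf: Sequence[float],
    threshold: float,
) -> float:
    """True-positive rate minus false-positive rate of the threshold attack.

    A value near 0 means the attacker does no better than chance;
    a value near 1 means training membership is almost fully exposed.
    """
    tpr = sum(threshold_attack(member_conf, threshold)) / len(member_conf)
    fpr = sum(threshold_attack(nonmember_conf, threshold)) / len(nonmember_conf)
    return tpr - fpr


# Hypothetical confidences (made-up numbers, not results from the paper):
# models tend to be more confident on images they were trained on.
member_conf = [0.99, 0.97, 0.95, 0.60]
nonmember_conf = [0.90, 0.70, 0.55, 0.98]

adv = attack_advantage(member_conf, nonmember_conf, threshold=0.93)
print(f"attack advantage: {adv:.2f}")
```

Under this sketch, a defense is effective to the extent that it pushes the advantage toward zero while leaving diagnostic accuracy largely intact, which is the trade-off that a combined privacy-utility metric is meant to capture.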