Machine learning (ML) can help fight pandemics like COVID-19 by enabling rapid screening of large volumes of images. To perform data analysis while maintaining patient privacy, we create ML models that satisfy Differential Privacy (DP). Previous work on private COVID-19 models is partly based on small datasets, provides weaker or unclear privacy guarantees, and does not investigate practical privacy. We suggest improvements to close these gaps. We account for inherent class imbalances and evaluate the utility-privacy trade-off more extensively and over stricter privacy budgets. Our evaluation is supported by empirically estimating practical privacy through black-box Membership Inference Attacks (MIAs). The introduced DP should help limit leakage threats posed by MIAs, and our practical analysis is the first to test this hypothesis on the COVID-19 classification task. Our results indicate that the required privacy level may differ with the task-dependent practical threat posed by MIAs. They further suggest that empirical privacy leakage improves only marginally with increasing DP guarantees, so DP appears to have a limited impact on practical MIA defense. Our findings identify possibilities for better utility-privacy trade-offs, and we believe that empirical, attack-specific privacy estimation can play a vital role in tuning for practical privacy.
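To make the setting concrete, the sketch below shows one common way to train an image classifier under an (ε, δ)-DP guarantee via DP-SGD (per-sample gradient clipping plus calibrated noise), combined with inverse-frequency loss weights to account for class imbalance. This is a minimal illustration only: the Opacus library, the toy three-class data, the network, and all hyperparameters are assumptions for demonstration, not the paper's actual training setup.

```python
# Minimal, hypothetical sketch of DP-SGD training with class-weighted loss.
# Assumes the Opacus library; dataset, model, and hyperparameters are toy
# stand-ins, not the setup used in the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy stand-in for an imbalanced 3-class image dataset (e.g., X-ray scans).
images = torch.randn(512, 1, 64, 64)
labels = torch.randint(0, 3, (512,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=64)

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 3),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Inverse-frequency class weights counter the inherent class imbalance.
counts = torch.bincount(labels, minlength=3).float()
criterion = nn.CrossEntropyLoss(weight=counts.sum() / (3 * counts))

# Attach DP-SGD: gradients are clipped per sample and noised so that the
# whole training run satisfies (target_epsilon, target_delta)-DP.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    epochs=5,
    target_epsilon=1.0,   # a strict privacy budget, for illustration
    target_delta=1e-5,
    max_grad_norm=1.0,    # per-sample gradient clipping bound
)

for _ in range(5):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

print(f"spent epsilon = {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```

Tightening target_epsilon trades utility for a stronger formal guarantee; the abstract's empirical finding is that such tightening may buy only marginal additional protection against black-box MIAs in practice.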