The wide adoption and application of masked language models~(MLMs) on sensitive data (from legal to medical) necessitates a thorough quantitative investigation into their privacy vulnerabilities -- to what extent do MLMs leak information about their training data? Prior attempts at measuring leakage of MLMs via membership inference attacks have been inconclusive, implying the potential robustness of MLMs to privacy attacks. In this work, we posit that prior attempts were inconclusive because they based their attack solely on the MLM's model score. We devise a stronger membership inference attack based on likelihood ratio hypothesis testing that involves an additional reference MLM to more accurately quantify the privacy risks of memorization in MLMs. We show that masked language models are extremely susceptible to likelihood ratio membership inference attacks: our empirical results, on models trained on medical notes, show that our attack improves the AUC of prior membership inference attacks from 0.66 to an alarmingly high 0.90, with a significant improvement in the low-error region: at a 1% false positive rate, our attack is 51X more powerful than prior work.
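To make the idea concrete, the sketch below illustrates one way a likelihood-ratio membership score can be computed for an MLM: the candidate sample is scored under both the target model and a reference model via pseudo-log-likelihood (masking each token in turn), and the difference of the two scores is compared against a threshold. This is only a minimal illustration; the model names, the scoring details, and the threshold calibration here are assumptions, not the paper's exact configuration.

\begin{verbatim}
# Minimal sketch of a likelihood-ratio membership score for MLMs.
# Assumes pseudo-log-likelihood scoring; checkpoints below are stand-ins.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer


def pseudo_log_likelihood(model, tokenizer, text):
    """Sum of log p(token | rest) with each position masked in turn."""
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, input_ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[input_ids[i]].item()
    return total


def likelihood_ratio_score(sample, target, reference, tokenizer):
    """Higher score -> sample is more likely a member of the target's training data."""
    return (pseudo_log_likelihood(target, tokenizer, sample)
            - pseudo_log_likelihood(reference, tokenizer, sample))


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    # Stand-ins: in practice the target is the fine-tuned (e.g. clinical) MLM
    # and the reference is a general-domain MLM.
    target_mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()
    reference_mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()
    score = likelihood_ratio_score("Patient was prescribed 20mg of lisinopril.",
                                   target_mlm, reference_mlm, tok)
    # Compare against a threshold calibrated on known non-member samples.
    print(f"membership score: {score:.3f}")
\end{verbatim}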