Counterfactuals, an emerging type of model explanation, have recently attracted considerable attention from both industry and academia. Unlike conventional feature-based explanations (e.g., attributions), counterfactuals are hypothetical samples that flip model decisions with minimal perturbations to the query. Given valid counterfactuals, humans can reason under ``what-if'' circumstances and thus better understand the model's decision boundaries. However, releasing counterfactuals can be detrimental, since they may unintentionally leak sensitive information to adversaries, raising risks to both model security and data privacy. To bridge this gap, in this paper we propose a novel framework that generates differentially private counterfactuals (DPC) without touching the deployed model or the explanation set, where noise is injected for protection while the explanatory role of the counterfactual is preserved. In particular, we train an autoencoder with the functional mechanism to construct noisy class prototypes, and then derive the DPC from the latent prototypes, relying on the post-processing immunity of differential privacy. Further evaluations demonstrate the effectiveness of the proposed framework, showing that DPC can successfully mitigate the risks of both extraction and inference attacks.
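To make the pipeline concrete, the sketch below illustrates the two stages described above: constructing noisy class prototypes in the autoencoder's latent space, and deriving a counterfactual from them by decoding, which is pure post-processing and therefore preserves differential privacy. This is a minimal illustration, not the paper's implementation: the functional-mechanism training (noise injected into the polynomial coefficients of the objective) is elided, and Laplace noise is instead added directly to the prototypes as a simpler stand-in; all names, dimensions, and the values of `epsilon` and `sensitivity` are assumptions for demonstration.

```python
# Hedged sketch of the DPC pipeline: noisy latent class prototypes,
# then a counterfactual decoded from them via post-processing.
import torch
import torch.nn as nn

torch.manual_seed(0)
input_dim, latent_dim, n_classes = 32, 8, 2
epsilon, sensitivity = 1.0, 1.0  # assumed privacy budget and L1 sensitivity

# Toy autoencoder; in the paper it would be trained under the
# functional mechanism (omitted here for brevity).
encoder = nn.Sequential(nn.Linear(input_dim, latent_dim), nn.Tanh())
decoder = nn.Linear(latent_dim, input_dim)

def noisy_prototypes(x, y):
    """Per-class latent means perturbed with Laplace noise (stand-in for
    the functional mechanism used in the paper)."""
    z = encoder(x).detach()
    protos = torch.stack([z[y == c].mean(dim=0) for c in range(n_classes)])
    noise = torch.distributions.Laplace(0.0, sensitivity / epsilon).sample(protos.shape)
    return protos + noise

def derive_dpc(query, target_class, protos, step=0.5):
    """Move the query's latent code toward the noisy target prototype and
    decode. Decoding only post-processes DP output, so privacy is preserved."""
    z_q = encoder(query).detach()
    z_cf = (1 - step) * z_q + step * protos[target_class]
    return decoder(z_cf).detach()

# Toy usage: random "training" data and a single query.
x = torch.randn(100, input_dim)
y = torch.randint(0, n_classes, (100,))
protos = noisy_prototypes(x, y)
dpc = derive_dpc(torch.randn(1, input_dim), target_class=1, protos=protos)
```

Note that only the prototype construction consumes privacy budget; the interpolation and decoding in `derive_dpc` are free under post-processing immunity, which is why the framework never needs to touch the deployed model when answering a query.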