Machine learning services are being deployed in a wide range of applications, making it possible for an adversary with access to the algorithm and/or the model to gain access to sensitive data. This paper investigates fundamental bounds on information leakage. First, we identify and bound the success rate of the worst-case membership inference attack, connecting it to the generalization error of the target model. Second, we study how much sensitive information the algorithm stores about the training set and derive bounds on the mutual information between the sensitive attributes and the model parameters. Although our contributions are mostly theoretical, the bounds and the concepts involved are of practical relevance. Inspired by our theoretical analysis, we study linear regression and DNN models to illustrate how these bounds can be used to assess the privacy guarantees of ML models.
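As a rough illustration of the first claim (linking membership inference success to generalization error), the sketch below computes the success rate of the simple "gap" attack, which guesses that a sample is a member iff the model classifies it correctly; on a balanced member/non-member set its advantage over random guessing equals the train/test accuracy gap. This is a standard baseline, not the worst-case bound derived in the paper, and the accuracy values used are hypothetical.

```python
def gap_attack_success_rate(train_acc: float, test_acc: float) -> float:
    """Success rate of the 'gap' membership inference attack on a balanced
    member/non-member evaluation set: the attacker guesses 'member' iff the
    model classifies the point correctly. Illustrative baseline only."""
    return 0.5 * (train_acc + (1.0 - test_acc))


def membership_advantage_proxy(train_acc: float, test_acc: float) -> float:
    """Advantage of the gap attack over random guessing, i.e. the
    generalization gap (train accuracy minus test accuracy)."""
    return max(0.0, train_acc - test_acc)


# Hypothetical accuracies for illustration only.
print(gap_attack_success_rate(train_acc=0.99, test_acc=0.90))   # ~0.545
print(membership_advantage_proxy(train_acc=0.99, test_acc=0.90))  # 0.09
```

A model that generalizes perfectly (train accuracy equal to test accuracy) gives this attack no advantage, which is consistent with the abstract's point that leakage is tied to generalization error.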