The rise of machine learning and deep learning has led to significant improvements in several domains. This change is supported by both the dramatic increase in computational power and the collection of large datasets. Such massive datasets often include personal data, which can pose a threat to privacy. Membership inference attacks are a recent line of research that aims to determine whether specific samples were used to train a learning algorithm. In this paper, we develop a means of measuring the leakage of training data, leveraging a quantity that acts as a proxy for the total variation of a trained model near its training samples. We extend our work by providing a novel defense mechanism. Our contributions are supported by empirical evidence from convincing numerical experiments.
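To illustrate the intuition behind a variation-based membership signal, the following is a minimal, hypothetical sketch — not the paper's actual method. It trains a toy logistic-regression model, then estimates the model's local variation around a point by averaging output changes under small random perturbations; all function names, the probe radius, and the probe count are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: a crude finite-sample proxy for a model's
# local (total) variation near a candidate point, used as a membership
# signal. All names and hyperparameters here are assumptions.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Plain gradient-descent logistic regression (weights incl. bias)."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def local_variation(w, x, n_probes=50, radius=0.1):
    """Average |f(x + delta) - f(x)| over random Gaussian perturbations:
    a finite-sample proxy for how much the model varies near x."""
    base = sigmoid(np.append(x, 1.0) @ w)
    deltas = rng.normal(scale=radius, size=(n_probes, len(x)))
    probes = sigmoid(np.hstack([x + deltas, np.ones((n_probes, 1))]) @ w)
    return float(np.mean(np.abs(probes - base)))

# Toy data: two Gaussian blobs; train on half, hold the rest out.
X = np.vstack([rng.normal(-1, 0.5, (100, 2)), rng.normal(1, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
idx = rng.permutation(200)
train, held = idx[:100], idx[100:]
w = train_logreg(X[train], y[train])

score_members = np.mean([local_variation(w, X[i]) for i in train])
score_nonmembers = np.mean([local_variation(w, X[i]) for i in held])
print(f"mean local variation - members: {score_members:.4f}, "
      f"non-members: {score_nonmembers:.4f}")
```

Comparing the two averaged scores (e.g. by thresholding per-sample values) is one way such a quantity could be turned into a membership test; the actual estimator and defense are developed in the body of the paper.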