Membership inference attacks are a key measure for evaluating privacy leakage in machine learning (ML) models. These attacks aim to distinguish training members from non-members by exploiting the differential behavior of a model on member and non-member inputs. The goal of this work is to train ML models that have high membership privacy while largely preserving their utility; we therefore aim for an empirical membership privacy guarantee, as opposed to the provable privacy guarantees offered by techniques such as differential privacy, which have been shown to degrade model utility. Specifically, we propose a new framework for training privacy-preserving models that induces similar behavior on member and non-member inputs in order to mitigate membership inference attacks. Our framework, called SELENA, has two major components. The first component, and the core of our defense, is a novel ensemble architecture for training. This architecture, which we call Split-AI, splits the training data into random subsets and trains a model on each subset. At test time we use an adaptive inference strategy: our ensemble architecture aggregates the outputs of only those models that did not contain the input sample in their training data. We prove that our Split-AI architecture defends against a large family of membership inference attacks; however, it is susceptible to new adaptive attacks. We therefore use a second component in our framework, called Self-Distillation, to protect against such stronger attacks. The Self-Distillation component (self-)distills the training dataset through our Split-AI ensemble, without using any external public dataset. Through extensive experiments on major benchmark datasets, we show that SELENA presents a superior trade-off between membership privacy and utility compared to the state of the art.
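To make the two components concrete, the following is a minimal sketch of Split-AI training, adaptive inference, and Self-Distillation on synthetic data. It assumes scikit-learn logistic regressions as sub-models and illustrative constants (K sub-models, each sample assigned to K/2 of them); these choices are ours for exposition only and are not the paper's reference implementation or hyperparameters.

```python
# Illustrative sketch of Split-AI + Self-Distillation (not the authors' code).
# Assumptions: K, the per-sample membership pattern, and the sub-model class
# are all toy choices made for this example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic training data standing in for a benchmark dataset.
n, d, n_classes = 300, 10, 3
X = rng.normal(size=(n, d))
y = rng.integers(0, n_classes, size=n)

K = 5  # number of sub-models in the Split-AI ensemble (assumed value)

# Split-AI training: assign each sample to a random subset of sub-models,
# so every training sample is *absent* from some models' training data.
member_of = np.array([rng.choice(K, size=K // 2, replace=False) for _ in range(n)])
models = []
for k in range(K):
    idx = np.array([i for i in range(n) if k in member_of[i]])
    models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))

def split_ai_predict(i):
    """Adaptive inference for training sample i: aggregate (average) the
    predicted probabilities of only those sub-models that did NOT train
    on sample i, so the served output behaves like a non-member output."""
    non_members = [models[k] for k in range(K) if k not in member_of[i]]
    return np.mean([m.predict_proba(X[i:i + 1])[0] for m in non_members], axis=0)

# Self-Distillation: relabel the training set with Split-AI's outputs and
# train a single final model on them, without any external public dataset.
# (Hard labels are used here because LogisticRegression does not accept
# soft targets; a soft-label distillation loss is the natural alternative.)
soft_labels = np.array([split_ai_predict(i) for i in range(n)])
final_model = LogisticRegression(max_iter=1000).fit(X, soft_labels.argmax(axis=1))
```

The property this sketch preserves is the one the defense relies on: for every training point, the prediction used downstream is produced only by models that never saw that point, which removes the member/non-member behavioral gap that membership inference attacks exploit.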