Attribute-driven privacy aims to conceal a single attribute of a user, in contrast to anonymisation, which tries to hide the user's full identity in some data. When the attribute to protect from malicious inference is binary, perfect privacy requires the log-likelihood-ratio to be zero, so that the data carries no strength-of-evidence. This work presents an approach based on a normalizing flow that maps a feature vector into a latent space in which the strength-of-evidence related to the binary attribute and an independent residual are disentangled. It can be seen as a non-linear discriminant analysis in which the mapping is invertible, allowing generation by mapping the latent variable back to the original space. This framework makes it possible to manipulate the log-likelihood-ratio of the data and thus to set it to zero for privacy. We demonstrate the applicability of the approach on an attribute-driven privacy task in which sex information is removed from speaker embeddings. Results on the VoxCeleb2 dataset show the effectiveness of the method, which outperforms our previous experiments based on adversarial disentanglement in terms of both privacy and utility.
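The core operation described above can be illustrated with a toy sketch: map an embedding to a latent code whose first coordinate plays the role of the log-likelihood-ratio, set that coordinate to zero, and map back. The sketch below is only an assumption-laden illustration; it uses a fixed invertible linear map in place of the learned non-linear normalizing flow, and all names (f, f_inv, the choice of latent index) are hypothetical, not the paper's implementation.

```python
# Minimal sketch of the zero-evidence idea, assuming a toy invertible linear map
# instead of a learned normalizing flow. Dimensions and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)
D = 8                        # toy embedding dimensionality
A = rng.normal(size=(D, D))  # stands in for the flow; a real flow is learned and non-linear
A_inv = np.linalg.inv(A)

def f(x):
    """Feature space -> latent space (forward pass of the 'flow')."""
    return A @ x

def f_inv(z):
    """Latent space -> feature space (generation by inverting the mapping)."""
    return A_inv @ z

x = rng.normal(size=D)       # a toy speaker embedding
z = f(x)                     # latent code: z[0] plays the role of the strength-of-evidence (LLR)
z[0] = 0.0                   # perfect privacy for the binary attribute: log-likelihood-ratio set to zero
x_private = f_inv(z)         # protected embedding; the independent residual z[1:] is preserved
```

In the actual method the disentanglement is learned, so that zeroing the evidence coordinate removes the binary attribute while the residual keeps the information needed for the downstream (utility) task.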