Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs

It is known that deep neural networks trained to classify a non-sensitive target attribute can reveal some sensitive attributes of their input data through the features of different granularity that the classifier extracts. We take a step forward and show that deep classifiers can be trained to secretly encode a sensitive attribute of users' input data into the classifier's outputs for the target attribute at inference time. The resulting attack works even if users have a full white-box view of the classifier and keep all internal representations hidden, releasing only the classifier's outputs for the target attribute. We introduce an information-theoretical formulation of such attacks and present efficient empirical implementations for training honest-but-curious (HBC) classifiers based on this formulation: classifiers that can be accurate in predicting their target attribute, but can also exploit their outputs to secretly encode a sensitive attribute. Our evaluations on several tasks in real-world datasets show that a semi-trusted server can build a classifier that is not only perfectly honest but also accurately curious. Our work highlights a vulnerability that malicious machine learning service providers can exploit to attack their users' privacy in several seemingly safe scenarios, such as encrypted inference, computation at the edge, or private knowledge distillation. We conclude by showing the difficulties in distinguishing between standard and HBC classifiers, discussing challenges in defending against this vulnerability of deep classifiers, and enumerating related open directions for future studies.
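To make the idea of an output that is both honest and curious concrete, here is a minimal toy sketch. It is not the training procedure from the paper: the function names, the fixed confidence margins, and the hand-written encoder are all assumptions made for illustration, standing in for a trained network that would infer both attributes from the input itself. The sketch only shows that a single released probability for a binary target attribute can carry one extra bit in its confidence level while its thresholded prediction stays correct.

```python
# Toy illustration only: a hand-written stand-in for a trained HBC classifier.
# All names and margin values below are hypothetical, not from the paper.

HIGH_MARGIN = 0.4   # confidence margin used when the sensitive bit is 1
LOW_MARGIN = 0.1    # confidence margin used when the sensitive bit is 0
BIT_THRESHOLD = 0.25  # separates the two margins when decoding


def hbc_output(target_label: int, sensitive_bit: int) -> float:
    """Return P(target = 1), with the sensitive bit hidden in the confidence."""
    margin = HIGH_MARGIN if sensitive_bit == 1 else LOW_MARGIN
    return 0.5 + margin if target_label == 1 else 0.5 - margin


def honest_decode(prob: float) -> int:
    """What the user sees: the usual thresholded prediction of the target."""
    return 1 if prob > 0.5 else 0


def curious_decode(prob: float) -> int:
    """What the server can recover: the sensitive bit, read from confidence."""
    return 1 if abs(prob - 0.5) > BIT_THRESHOLD else 0


if __name__ == "__main__":
    for target in (0, 1):
        for bit in (0, 1):
            p = hbc_output(target, bit)
            print(target, bit, round(p, 2), honest_decode(p), curious_decode(p))
```

In this toy, the target prediction is always correct, so a user checking only accuracy would see nothing unusual, yet anyone who knows the encoding can read the sensitive bit from the same output.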
