Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity

Existing regulations prohibit model developers from accessing protected attributes (gender, race, etc.), so fairness is often assessed on populations whose protected groups are unknown to the developers. In such scenarios, institutions often separate the model developers (who train models with no access to the protected attributes) from a compliance team (who may have access to the entire dataset for auditing purposes). However, the model developers might be allowed to test their models for bias by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. In particular, we show that one can reconstruct the protected attributes of all the individuals from O(N_k log(n/N_k)) queries when N_k ≪ n, using techniques from compressed sensing (n: size of the test dataset; N_k: size of the smallest group). Our results pose an interesting debate in algorithmic fairness: should querying for fairness metrics be viewed as a neutral-valued solution to ensure compliance with regulations? Or does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this supposed violation, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query, outperforming naive techniques such as the Laplace mechanism. We also include experimental results on the Adult dataset and on synthetic data across a broad range of parameters.
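
To make the single-query leak concrete, here is a minimal sketch of one possible strategy. It is not taken from the paper. It assumes a binary protected attribute and that the compliance team returns the signed statistical parity gap P(ŷ=1 | a=1) − P(ŷ=1 | a=0). The function names `statistical_parity_gap` and `infer_attribute` are hypothetical, chosen only for illustration. The developer submits predictions that are positive for the targeted individual alone, so the sign of the returned gap reveals that individual's group.

```python
import numpy as np

def statistical_parity_gap(y_pred, protected):
    """Compliance-team side (assumed form of the query):
    signed gap P(yhat=1 | a=1) - P(yhat=1 | a=0)."""
    y_pred = np.asarray(y_pred)
    protected = np.asarray(protected)
    return y_pred[protected == 1].mean() - y_pred[protected == 0].mean()

def infer_attribute(query, n, target):
    """Developer side: one query with a positive prediction
    only for `target`; the sign of the answer gives the group."""
    y_pred = np.zeros(n, dtype=int)
    y_pred[target] = 1
    gap = query(y_pred)
    # Target in group 1 -> gap = 1/N_1 > 0; in group 0 -> gap = -1/N_0 < 0.
    return 1 if gap > 0 else 0

# Synthetic test set; the developer never sees `protected` directly.
rng = np.random.default_rng(0)
n = 20
protected = rng.integers(0, 2, size=n)
protected[0], protected[1] = 0, 1  # make sure both groups are non-empty

# The only channel to the protected attribute is the fairness query.
query = lambda y_pred: statistical_parity_gap(y_pred, protected)

# Attack each individual in turn, one query per individual.
recovered = [infer_attribute(query, n, i) for i in range(n)]
print(recovered == protected.tolist())
```

This naive loop spends one query per individual, n in total; the abstract's compressed-sensing result concerns recovering all attributes with far fewer queries when the smallest group is small. If only the absolute gap were returned, the sign trick would not apply directly and a different strategy would be needed.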

