Organizations that collect and sell data face increasing scrutiny for the discriminatory use of data. We propose a novel unsupervised approach to transform data into a compressed binary representation independent of sensitive attributes. We show that, in an information bottleneck framework, a parsimonious representation should filter out information related to sensitive attributes if those attributes are provided directly to the decoder. Empirical results show that the proposed method, \textbf{FBC}, achieves a state-of-the-art accuracy-fairness trade-off. Explicit control of the entropy of the representation bit stream allows the user to move smoothly and simultaneously along both rate-distortion and rate-fairness curves. \end{abstract}