Neural network architectures have been extensively employed in fair representation learning, where the objective is to learn a new representation of a given data vector that is independent of sensitive information. Various representation debiasing techniques have been proposed in the literature. However, as neural networks are inherently opaque, these methods are hard to comprehend, which limits their usefulness. We propose a new framework for fair representation learning centered around the learning of "correction vectors", which have the same dimensionality as the given data vectors. Correction vectors may be computed either explicitly, via architectural constraints, or implicitly, by training an invertible model based on Normalizing Flows. We show experimentally that several fair representation learning models constrained in this way exhibit no loss in ranking or classification performance. Furthermore, we demonstrate that the invertible model achieves state-of-the-art results. Finally, we discuss the legal standing of our methodology in light of recent legislation in the European Union.
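As a rough illustration of the explicit setting, a correction vector can be produced by a small network whose output is added back to the input, so that the learned debiasing is directly readable as a per-feature adjustment. The following PyTorch sketch shows this idea; the module names, architecture, and dimensions are illustrative assumptions of ours, not the paper's exact model.

```python
import torch
import torch.nn as nn

class CorrectionVectorEncoder(nn.Module):
    """Sketch of explicit correction-vector learning: a small network
    computes a correction w(x) with the same dimensionality as x, and
    the debiased representation is x + w(x). Architecture and sizes
    are illustrative assumptions, not the authors' exact model."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.corrector = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),  # output matches input dimensionality
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.corrector(x)  # the interpretable correction vector
        return x + w           # corrected (fair) representation


# Usage: inspect the correction applied to each feature of a sample.
enc = CorrectionVectorEncoder(dim=10)
x = torch.randn(4, 10)
z = enc(x)
print((z - x)[0])  # the correction vector for the first sample
```

In practice such a module would be trained jointly with a debiasing objective (e.g. an adversary predicting the sensitive attribute from `x + w(x)`); the point of the architectural constraint is that the difference between input and representation remains a human-inspectable vector in feature space.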