As they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among the various algorithms for fair AI, learning fair representation (LFR), whose goal is to find a representation that is fair with respect to sensitive variables such as gender and race, has received much attention. For LFR, adversarial training schemes are popularly employed, as in generative adversarial network (GAN)-type algorithms. The choice of the discriminator, however, is typically made heuristically, without justification. In this paper, we propose a new adversarial training scheme for LFR that uses the integral probability metric (IPM) with a specific parametric family of discriminators. The most notable result of the proposed LFR algorithm is its theoretical guarantee on the fairness of the final prediction model, which has not been established before. That is, we derive theoretical relations between the fairness of the representation and the fairness of the prediction model built on top of the representation (i.e., using the representation as the input). Moreover, through numerical experiments, we show that our proposed LFR algorithm is computationally lighter and more stable, and that the final prediction model is competitive with or superior to those produced by other LFR algorithms that use more complex discriminators.
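To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of IPM-based adversarial training for fair representation: the discriminator, drawn from a simple parametric family, ascends an empirical IPM between the representation distributions of the two sensitive groups, while the encoder descends it. The binary sensitive attribute, the single-sigmoid discriminator family, and all names (`Encoder`, `SigmoidDiscriminator`, `train_step`, `lam`) are illustrative assumptions; the prediction-task loss is omitted for brevity.

```python
# Minimal sketch of IPM-based adversarial learning of fair representation.
# Assumptions: binary sensitive attribute s, a single-sigmoid discriminator
# family, and mini-batches that contain samples from both groups.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps inputs x to representations z."""
    def __init__(self, in_dim: int, rep_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, rep_dim)
        )
    def forward(self, x):
        return self.net(x)

class SigmoidDiscriminator(nn.Module):
    """One element of a simple parametric family F = {sigmoid(w^T z + b)}."""
    def __init__(self, rep_dim: int):
        super().__init__()
        self.linear = nn.Linear(rep_dim, 1)
    def forward(self, z):
        return torch.sigmoid(self.linear(z))

def ipm_estimate(disc, z0, z1):
    """Empirical IPM term |E[f(z) | s=0] - E[f(z) | s=1]| for one f in F."""
    return (disc(z0).mean() - disc(z1).mean()).abs()

def train_step(enc, disc, opt_enc, opt_disc, x, s, lam=1.0):
    z = enc(x)
    z0, z1 = z[s == 0], z[s == 1]

    # 1) Discriminator step: maximize the IPM estimate (detach the encoder).
    opt_disc.zero_grad()
    (-ipm_estimate(disc, z0.detach(), z1.detach())).backward()
    opt_disc.step()

    # 2) Encoder step: minimize lam * IPM; a task loss (e.g. cross-entropy
    # of a downstream predictor) would be added to this objective.
    opt_enc.zero_grad()
    (lam * ipm_estimate(disc, z0, z1)).backward()
    opt_enc.step()

if __name__ == "__main__":
    torch.manual_seed(0)
    enc = Encoder(in_dim=10, rep_dim=4)
    disc = SigmoidDiscriminator(rep_dim=4)
    opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-3)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)
    x = torch.randn(128, 10)
    s = torch.randint(0, 2, (128,))  # binary sensitive attribute
    train_step(enc, disc, opt_enc, opt_disc, x, s)
```

Restricting the discriminator to a small parametric family is what keeps each adversarial step cheap and stable compared with the deep-network discriminators used by other LFR algorithms.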