Motivated by the need for fair algorithmic decision-making in the age of automation and artificially intelligent technology, this technical report provides theoretical insight into adversarial training for fairness in deep learning. We build on previous work in adversarial fairness, demonstrate the persistent tradeoff between fair predictions and model performance, and explore further mechanisms that help offset this tradeoff.