Despite recent advances in fairness-aware machine learning, predictive models often exhibit discriminatory behavior towards marginalized groups. Such unfairness may arise from biased training data, model design choices, or representational disparities across groups, posing significant challenges in high-stakes decision-making domains such as college admissions. While existing fair learning models aim to mitigate bias, achieving an optimal trade-off between fairness and accuracy remains difficult. Moreover, reliance on black-box models hinders interpretability, limiting their applicability in socially sensitive domains. In this paper, we address these issues by integrating Kolmogorov-Arnold Networks (KANs) into a fair adversarial learning framework. Leveraging the adversarial robustness and interpretability of KANs, our approach balances fairness and accuracy. To further support this balance, we propose an adaptive penalty update mechanism that dynamically adjusts fairness constraints during training. We conduct numerical experiments on two real-world college admissions datasets under three different optimization strategies. The results demonstrate the effectiveness and robustness of the proposed approach: it consistently outperforms baseline fair learning models, maintaining high predictive accuracy while achieving competitive fairness across sensitive attributes.
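To make the described mechanism concrete, the following is a minimal sketch of fair adversarial training with an adaptive penalty update. It is not the paper's implementation: the abstract does not specify the KAN architecture or the exact update rule, so the predictor is approximated by a small MLP, and the penalty rule (grow the weight when the observed fairness gap exceeds a target, shrink it otherwise) is one plausible choice. All names and hyperparameters (`lam`, `eta`, `target_gap`) are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: the paper uses a KAN predictor; an MLP is
# substituted here since the abstract does not specify the architecture.
predictor = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

lam = 1.0          # fairness penalty weight (hypothetical initial value)
eta = 0.05         # penalty adaptation rate (hypothetical)
target_gap = 0.02  # acceptable demographic-parity gap (hypothetical)

X = torch.randn(256, 10)                   # features (synthetic)
y = torch.randint(0, 2, (256, 1)).float()  # admission label (synthetic)
s = torch.randint(0, 2, (256, 1)).float()  # sensitive attribute (synthetic)

for epoch in range(100):
    # 1) Adversary tries to recover the sensitive attribute from the
    #    predictor's outputs.
    adv_loss = bce(adversary(predictor(X).detach()), s)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) Predictor minimizes task loss while fooling the adversary,
    #    weighted by the current penalty lam.
    logits = predictor(X)
    loss = bce(logits, y) - lam * bce(adversary(logits), s)
    opt_pred.zero_grad()
    loss.backward()
    opt_pred.step()

    # 3) Adaptive penalty update: tighten the fairness constraint when the
    #    demographic-parity gap exceeds the target, relax it otherwise.
    with torch.no_grad():
        p = torch.sigmoid(predictor(X))
        gap = (p[s.squeeze() == 1].mean() - p[s.squeeze() == 0].mean()).abs()
        lam = max(0.0, lam + eta * (gap.item() - target_gap))
```

Updating `lam` from the measured gap, rather than fixing it in advance, is what lets the constraint stay loose early in training (favoring accuracy) and tighten only when the model's predictions drift away from the fairness target.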