In machine learning, training data often capture the behaviour of multiple subgroups of an underlying human population. When the nature of the training data for the subgroups is not controlled carefully, under-representation bias arises. To counter this bias in time-series forecasting problems, we introduce two natural notions of fairness: subgroup fairness and instantaneous fairness. We present globally convergent methods for the resulting fairness-constrained learning problems, based on hierarchies of convexifications of non-commutative polynomial optimisation problems. Our empirical results on a biased data set motivated by insurance applications and on the well-known COMPAS data set demonstrate the efficacy of our methods. We also show that exploiting sparsity in the convexifications reduces the run time of our methods considerably.