Researchers have proposed many methods for fair and robust machine learning, but comprehensive empirical evaluation of their subgroup robustness is lacking. In this work, we address this gap in the context of tabular data, where sensitive subgroups are clearly defined, real-world fairness problems abound, and prior works often do not compare to state-of-the-art tree-based models as baselines. We conduct an empirical comparison of several previously proposed methods for fair and robust learning alongside state-of-the-art tree-based methods and other baselines. Via experiments with more than $340{,}000$ model configurations on eight datasets, we show that tree-based methods have strong subgroup robustness, even when compared to robustness- and fairness-enhancing methods. Moreover, the best tree-based models tend to show good performance over a range of metrics, while robust or group-fair models can show brittleness, with significant performance differences across different metrics for a fixed model. We also demonstrate that tree-based models show less sensitivity to hyperparameter configurations, and are less costly to train. Our work suggests that tree-based ensemble models make an effective baseline for tabular data, and are a sensible default when subgroup robustness is desired. For associated code and detailed results, see https://github.com/jpgard/subgroup-robustness-grows-on-trees .
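As a concrete illustration of the kind of evaluation described above, the sketch below trains a tree-based ensemble on synthetic tabular data and measures its subgroup robustness as worst-group accuracy (the minimum accuracy over sensitive subgroups). The synthetic data, binary sensitive attribute, and choice of `GradientBoostingClassifier` are illustrative assumptions, not the paper's actual datasets, models, or metrics.

```python
# Hedged sketch: worst-group accuracy of a tree-based ensemble on
# synthetic tabular data. The data-generating process and the sensitive
# attribute here are made up for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
group = rng.integers(0, 2, size=n)  # hypothetical binary sensitive attribute
# Label depends on a feature plus a group-correlated shift (illustrative).
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0
)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Per-subgroup accuracy, then worst-group accuracy = min over subgroups.
accs = {g: float((pred[g_te == g] == y_te[g_te == g]).mean()) for g in (0, 1)}
worst_group_acc = min(accs.values())
print(f"per-group accuracy: {accs}, worst-group accuracy: {worst_group_acc:.3f}")
```

The same loop over subgroups applies to any metric (e.g., per-group AUC or calibration error), which is how "performance over a range of metrics" for a fixed model can be compared across subgroups.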