Researchers have proposed many methods for fair and robust machine learning, but comprehensive empirical evaluation of their subgroup robustness is lacking. In this work, we address this gap in the context of tabular data, where sensitive subgroups are clearly defined, real-world fairness problems abound, and prior work often does not compare against state-of-the-art tree-based models as baselines. We conduct an empirical comparison of several previously proposed methods for fair and robust learning alongside state-of-the-art tree-based methods and other baselines. Via experiments with more than $340{,}000$ model configurations on eight datasets, we show that tree-based methods have strong subgroup robustness, even when compared to robustness- and fairness-enhancing methods. Moreover, the best tree-based models tend to perform well across a range of metrics, while robust or group-fair models can be brittle, with significant performance differences across metrics for a fixed model. We also demonstrate that tree-based models are less sensitive to hyperparameter configurations and less costly to train. Our work suggests that tree-based ensemble models make an effective baseline for tabular data and are a sensible default when subgroup robustness is desired. For associated code and detailed results, see https://github.com/jpgard/subgroup-robustness-grows-on-trees.