The predictive performance of machine learning models trained with empirical risk minimization (ERM) can degrade considerably under distribution shift. Spurious correlations in the training data lead ERM-trained models to incur high loss on minority groups in which those correlations do not hold. Numerous methods have been proposed to improve worst-group robustness, but they require group information for each training input or, at least, a validation set with group labels to tune their hyperparameters; such annotations may be expensive to obtain or unknown a priori. In this paper, we address the challenge of improving group robustness without any group annotation during training or validation. To this end, we propose to partition the training dataset into groups based on Gram matrices of features extracted by an ``identification'' model, and to apply robust optimization over these pseudo-groups. In the realistic setting where no group labels are available, our experiments show that our approach not only improves group robustness over ERM but also outperforms all recent baselines.
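The group-identification step described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the feature extractor, the choice of layer, the Gram-vector representation, and the number of pseudo-groups are all assumptions, and the simple k-means routine stands in for whatever clustering the method actually uses.

```python
# Hypothetical sketch: per-example Gram matrices of intermediate features
# are clustered into pseudo-group labels, which could then feed a
# group-robust objective (e.g. group DRO) in place of true group labels.
import numpy as np

def gram_vector(feat):
    """feat: (C, H*W) feature map from an 'identification' model.
    Returns the flattened upper triangle of the normalized Gram matrix."""
    g = feat @ feat.T / feat.shape[1]          # (C, C) Gram matrix
    iu = np.triu_indices(g.shape[0])
    return g[iu]

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means (assumed clustering choice, for illustration only)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def pseudo_groups(features, n_groups=4, seed=0):
    """Assign each training example a pseudo-group from its Gram vector."""
    X = np.stack([gram_vector(f) for f in features])
    return kmeans(X, n_groups, seed=seed)

# Toy usage: 8 examples with (C=4, H*W=16) feature maps.
rng = np.random.default_rng(0)
feats = [rng.normal(size=(4, 16)) for _ in range(8)]
labels = pseudo_groups(feats, n_groups=2)
```

The resulting `labels` array plays the role of group annotations for the downstream robust-optimization stage.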