Improving worst-group performance and generalization performance are core problems in modern machine learning. Diverse techniques have been proposed to improve performance, such as weight-norm penalties and data augmentation, but their gains are limited. Recently, two promising approaches have emerged to improve worst-group performance and generalization performance, respectively. Distributionally robust optimization (DRO) focuses on the worst, or hardest, group to improve worst-group performance, while sharpness-aware minimization (SAM) seeks flat minima to improve generalization to unseen data. These methods show significant performance improvements on worst-group datasets and unseen datasets, respectively. However, DRO does not guarantee flatness, and SAM does not guarantee worst-group improvement. In other words, both DRO and SAM may fail to improve worst-group performance when a distribution shift occurs between the training and test datasets. In this study, we propose a new approach, sharpness-aware group distributionally robust optimization (SGDRO). SGDRO finds flat minima that generalize well on the worst-group dataset. Unlike DRO and SAM, SGDRO improves generalization ability even when a distribution shift occurs. We validate that SGDRO attains a smaller maximum eigenvalue and improved performance on the worst group.
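To make the combination concrete, the update described above can be sketched as follows. This is a minimal illustrative sketch on a toy linear model, not the paper's exact algorithm: it assumes GroupDRO-style exponential up-weighting of group losses (step size `eta_q`) followed by a SAM-style ascent step of radius `rho` on the group-weighted loss; all hyperparameter values and the helper names (`group_losses`, `sgdro_step`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def group_losses(w, X, y, groups, n_groups):
    # mean squared error of a linear model, computed per group
    preds = X @ w
    return np.array([np.mean((preds[groups == g] - y[groups == g]) ** 2)
                     for g in range(n_groups)])

def weighted_grad(w, X, y, groups, q):
    # gradient of the group-weighted MSE at w
    grad = np.zeros_like(w)
    for g in range(len(q)):
        m = groups == g
        grad += q[g] * 2 * X[m].T @ (X[m] @ w - y[m]) / m.sum()
    return grad

def sgdro_step(w, X, y, groups, q, rho=0.05, eta_q=0.1, lr=0.1):
    # 1) GroupDRO-style weighting: up-weight currently hard groups
    q = q * np.exp(eta_q * group_losses(w, X, y, groups, len(q)))
    q = q / q.sum()
    # 2) SAM-style perturbation: ascend to a nearby high-loss point
    grad = weighted_grad(w, X, y, groups, q)
    eps = rho * grad / (np.linalg.norm(grad) + 1e-12)
    # 3) Descend using the gradient taken at the perturbed point
    grad_adv = weighted_grad(w + eps, X, y, groups, q)
    return w - lr * grad_adv, q

# toy data: two groups whose optimal weights differ slightly
X = rng.normal(size=(200, 3))
groups = (rng.random(200) < 0.5).astype(int)
y = X @ np.array([1.0, -1.0, 0.5]) + 0.3 * groups * X[:, 0]

w, q = np.zeros(3), np.ones(2) / 2
for _ in range(200):
    w, q = sgdro_step(w, X, y, groups, q)
print(group_losses(w, X, y, groups, 2).max())
```

The key design point is the order of operations: the group weights `q` are updated first, so the SAM perturbation is computed on the already re-weighted objective, which is what lets the flatness search target the worst group rather than the average loss.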