This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions. First, we formulate GDRO as a stochastic convex-concave saddle-point problem, which is then solved by stochastic mirror descent (SMD) with $m$ samples in each iteration, and attain a nearly optimal sample complexity. To reduce the number of samples required in each round from $m$ to 1, we cast GDRO as a two-player game, where one player conducts SMD and the other executes an online algorithm for non-oblivious multi-armed bandits, maintaining the same sample complexity. Next, we extend GDRO to address scenarios involving imbalanced data and heterogeneous distributions. In the first scenario, we introduce a weighted variant of GDRO, enabling distribution-dependent convergence rates that rely on the number of samples from each distribution. We design two strategies to meet the sample budget: one integrates non-uniform sampling into SMD, and the other employs the stochastic mirror-prox algorithm with mini-batches, both of which deliver faster rates for distributions with more samples. In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of outlier distributions. Similar to the case of vanilla GDRO, we develop two stochastic approaches: one uses $m$ samples per iteration via SMD, and the other consumes $k$ samples per iteration through an online algorithm for non-oblivious combinatorial semi-bandits.
翻译:暂无翻译