Social and real-world considerations such as robustness, fairness, social welfare, and multi-agent tradeoffs have given rise to multi-distribution learning paradigms, such as collaborative learning, group distributionally robust optimization, and fair federated learning. In each of these settings, a learner seeks to minimize its worst-case loss over a set of $n$ predefined distributions, while using as few samples as possible. In this paper, we establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity. Importantly, our sample complexity bounds exceed the sample complexity of learning a single distribution by only an additive factor of $n \log(n) / \epsilon^2$. They improve upon the best known sample complexity of agnostic federated learning by Mohri et al. by a multiplicative factor of $n$, improve upon the sample complexity of collaborative learning by Nguyen and Zakynthinou by a multiplicative factor of $\log(n) / \epsilon^3$, and give the first sample complexity bounds for the group DRO objective of Sagawa et al. To achieve optimal sample complexity, our algorithms sample and learn from distributions on demand. Our algorithm design and analysis are enabled by our extensions of stochastic optimization techniques for solving stochastic zero-sum games. In particular, we contribute variants of Stochastic Mirror Descent that can trade off between players' access to cheap one-off samples and more expensive reusable ones.
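To make the zero-sum-game view concrete, here is a minimal sketch (not the paper's exact algorithm) of stochastic mirror descent applied to $\min_w \max_{\lambda \in \Delta_n} \sum_i \lambda_i L_i(w)$: the model player takes stochastic gradient steps on the $\lambda$-weighted loss, while the adversary player takes an entropic mirror ascent (multiplicative-weights) step over the $n$ distributions, which is the natural mirror descent geometry on the simplex. The linear model, squared loss, step sizes, and synthetic distributions below are all illustrative assumptions, and each round uses cheap one-off samples from every distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3  # number of distributions, feature dimension

# Synthetic "distributions": each has its own ground-truth linear target.
targets = rng.normal(size=(n, d))

def sample_loss_grad(i, w, batch=32):
    """Draw fresh one-off samples from distribution i; return the empirical
    squared loss of w and its gradient."""
    X = rng.normal(size=(batch, d))
    resid = X @ (w - targets[i])
    loss = 0.5 * np.mean(resid ** 2)
    grad = X.T @ resid / batch
    return loss, grad

w = np.zeros(d)
lam = np.full(n, 1.0 / n)        # adversary's weights over the n distributions
eta_w, eta_lam = 0.1, 0.5        # illustrative step sizes

for t in range(500):
    losses = np.empty(n)
    grads = np.empty((n, d))
    for i in range(n):
        losses[i], grads[i] = sample_loss_grad(i, w)
    # Model player: stochastic gradient step on the lam-weighted loss.
    w -= eta_w * (lam @ grads)
    # Adversary player: entropic mirror ascent (multiplicative weights),
    # shifting mass toward the distributions with the highest current loss.
    lam *= np.exp(eta_lam * losses)
    lam /= lam.sum()

worst = max(sample_loss_grad(i, w, batch=2000)[0] for i in range(n))
print(f"worst-case loss across the {n} distributions: {worst:.3f}")
```

The multiplicative-weights update is what drives on-demand sampling in spirit: as $\lambda$ concentrates on the hardest distributions, a sampling-based variant would draw more of its (possibly expensive) samples from exactly those distributions rather than uniformly.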