A popular assumption for out-of-distribution generalization is that the training data comprises sub-datasets, each drawn from a distinct distribution; the goal is then to "interpolate" these distributions and "extrapolate" beyond them -- an objective broadly known as domain generalization. A common belief is that ERM can interpolate but not extrapolate, and that the latter task is considerably harder, but these claims are vague and lack formal justification. In this work, we recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions. Under an existing notion of inter- and extrapolation based on reweighting of sub-group likelihoods, we rigorously demonstrate that extrapolation is computationally much harder than interpolation, though their statistical complexity is not significantly different. Furthermore, we show that ERM -- or a noisy variant -- is provably minimax-optimal for both tasks. Our framework presents a new avenue for the formal analysis of domain generalization algorithms, which may be of independent interest.
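To make the reweighting notion and the online game concrete, the following is a minimal sketch of one common formalization; the notation ($P_1, \dots, P_k$, $\Delta_k$, $\Lambda_\alpha$, $V$) is illustrative and not necessarily the paper's exact definitions. Suppose the $k$ sub-groups induce distributions $P_1, \dots, P_k$, and a reweighting $\lambda$ yields the test distribution $P_\lambda = \sum_{i=1}^k \lambda_i P_i$. Then interpolation and extrapolation can be read as the adversary choosing $\lambda$ from
\[
\Delta_k = \Big\{ \lambda \in \mathbb{R}^k : \lambda_i \ge 0,\ \sum_{i=1}^k \lambda_i = 1 \Big\}
\quad \text{(interpolation: convex mixtures)},
\]
\[
\Lambda_\alpha = \Big\{ \lambda \in \mathbb{R}^k : \lambda_i \ge -\alpha,\ \sum_{i=1}^k \lambda_i = 1,\ P_\lambda \text{ a valid distribution} \Big\}
\quad \text{(extrapolation: affine reweightings, } \alpha > 0\text{)},
\]
and the game between the risk-minimizing player and the distribution-choosing adversary has minimax value
\[
V = \min_{h \in \mathcal{H}} \; \max_{\lambda} \; \mathbb{E}_{(x,y) \sim P_\lambda}\big[ \ell(h(x), y) \big],
\]
with $\lambda$ ranging over $\Delta_k$ for interpolation and over the strictly larger set $\Lambda_\alpha \supset \Delta_k$ for extrapolation. Under this reading, the computational-hardness gap is a statement about approximating $V$ over the larger reweighting set.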