Domain generalization (DG) seeks predictors that perform well on unseen test distributions by leveraging labeled training data from multiple related distributions or domains. To achieve this, the standard formulation optimizes for worst-case performance over the set of all possible domains. However, because worst-case shifts are very unlikely in practice, this generally leads to overly conservative solutions. In fact, a recent study found that no DG algorithm outperformed empirical risk minimization in terms of average performance. In this work, we argue that DG is neither a worst-case problem nor an average-case problem, but rather a probabilistic one. To this end, we propose a probabilistic framework for DG, which we call Probable Domain Generalization, wherein our key idea is that distribution shifts seen during training should inform us of probable shifts at test time. To realize this, we explicitly relate training and test domains as draws from the same underlying meta-distribution, and propose a new optimization problem -- Quantile Risk Minimization (QRM) -- which requires that predictors generalize with high probability. We then prove that QRM: (i) produces predictors that generalize to new domains with a desired probability, given sufficiently many domains and samples; and (ii) recovers the causal predictor as the desired probability of generalization approaches one. In our experiments, we introduce a more holistic quantile-focused evaluation protocol for DG, and show that our algorithms outperform state-of-the-art baselines on real and synthetic data.
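As a minimal sketch of the objective described above (the notation here is assumed for illustration, not quoted from the paper): writing $\mathcal{Q}$ for the meta-distribution over domains $e$, $\mathcal{R}(f, e)$ for the risk of predictor $f$ on domain $e$, and $\alpha \in (0, 1)$ for the desired probability of generalization, QRM can be read as minimizing the $\alpha$-quantile of the risk induced by drawing domains from $\mathcal{Q}$:
\[
\min_{f} \; F^{-1}_{\mathcal{R}(f, e)}(\alpha), \qquad e \sim \mathcal{Q},
\]
where $F^{-1}_{\mathcal{R}(f, e)}$ denotes the quantile function (inverse CDF) of the random risk $\mathcal{R}(f, e)$. Equivalently, the sketch seeks the smallest risk threshold $t$ such that $\Pr_{e \sim \mathcal{Q}}[\mathcal{R}(f, e) \le t] \ge \alpha$: for moderate $\alpha$ this trades a small probability of failure for less conservative predictors, while letting $\alpha \to 1$ approaches the worst-case formulation, consistent with the abstract's claim of recovering the causal predictor in that limit.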