We propose Predict then Interpolate (PI), a simple algorithm for learning correlations that are stable across environments. The algorithm follows from the intuition that when a classifier trained on one environment makes predictions on examples from another environment, its mistakes are informative as to which correlations are unstable. In this work, we prove that by interpolating the distributions of the correct predictions and the incorrect predictions, we can uncover an oracle distribution where the unstable correlation vanishes. Since the oracle interpolation coefficients are not accessible, we use group distributionally robust optimization to minimize the worst-case risk across all such interpolations. We evaluate our method on both text classification and image classification. Empirical results demonstrate that our algorithm is able to learn robust classifiers (outperforming IRM by 23.85% on synthetic environments and 12.41% on natural environments). Our code and data are available at https://github.com/YujiaBao/Predict-then-Interpolate.
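A minimal sketch of this recipe on toy data, assuming two training environments and simple linear classifiers: a per-environment classifier partitions the other environment into correctly and incorrectly predicted groups, and group DRO then minimizes the worst-case risk over these groups, which upper-bounds the risk under any interpolation of the two partitions. The helper names (`train_env_classifier`, `partition_by_prediction`, `group_dro_step`) and hyperparameters are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def train_env_classifier(x, y, n_features, epochs=200, lr=0.1):
    """Fit a linear classifier on a single environment with plain ERM."""
    clf = nn.Linear(n_features, 2)
    opt = torch.optim.SGD(clf.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(clf(x), y).backward()
        opt.step()
    return clf


def partition_by_prediction(clf, x, y):
    """Split (x, y) into the examples clf predicts correctly and incorrectly."""
    with torch.no_grad():
        correct = clf(x).argmax(dim=1) == y
    return (x[correct], y[correct]), (x[~correct], y[~correct])


def group_dro_step(model, opt, groups, group_weights, eta=0.1):
    """One group-DRO update: up-weight the groups with the highest risk."""
    losses = torch.stack([F.cross_entropy(model(gx), gy) for gx, gy in groups])
    # Exponentiated-gradient update on the group weights (worst-case focus).
    group_weights = group_weights * torch.exp(eta * losses.detach())
    group_weights = group_weights / group_weights.sum()
    opt.zero_grad()
    (group_weights * losses).sum().backward()
    opt.step()
    return group_weights


if __name__ == "__main__":
    torch.manual_seed(0)
    n, d = 500, 10
    # Two toy environments sharing the label signal but with different noise.
    envs = []
    for scale in (0.1, 1.0):
        y = torch.randint(0, 2, (n,))
        x = y.float().unsqueeze(1) + scale * torch.randn(n, d)
        envs.append((x, y))

    # Step 1: train a classifier on each environment, then use it to
    # partition the *other* environment into correct / incorrect groups.
    groups = []
    for i, (xi, yi) in enumerate(envs):
        clf_i = train_env_classifier(xi, yi, d)
        for j, (xj, yj) in enumerate(envs):
            if i == j:
                continue
            right, wrong = partition_by_prediction(clf_i, xj, yj)
            for gx, gy in (right, wrong):
                if len(gy) > 0:
                    groups.append((gx, gy))

    # Step 2: minimize the worst-case risk over these groups with group DRO,
    # which covers every interpolation of the correct / incorrect partitions.
    model = nn.Linear(d, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    weights = torch.ones(len(groups)) / len(groups)
    for _ in range(200):
        weights = group_dro_step(model, opt, groups, weights)
    print("final group weights:", weights.tolist())
```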