Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness. A promising formulation is domain-invariant learning, which frames the key issue as determining which features are domain-specific and which are domain-invariant. An important assumption in this area is that the training examples are partitioned into "domains" or "environments". Our focus is on the more common setting where such partitions are not provided. We propose EIIL, a general framework for domain-invariant learning that incorporates Environment Inference to directly infer partitions that are maximally informative for downstream Invariant Learning. We show that EIIL outperforms invariant learning methods on the CMNIST benchmark without using environment labels, and achieves significantly better worst-group performance than ERM on the Waterbirds and CivilComments datasets. Finally, we establish connections between EIIL and algorithmic fairness, which enables EIIL to improve accuracy and calibration in a fair prediction problem.
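To make the environment-inference idea concrete, below is a minimal, illustrative sketch (not the authors' released code) of how a soft environment partition could be inferred for binary classification: given the fixed logits of a reference model (e.g., trained with ERM), per-example assignment weights are optimized to maximize an IRMv1-style invariance penalty, so that the learned split is the one on which the reference model is least invariant. Function names, hyperparameters, and the two-environment restriction are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F


def soft_irm_penalty(logits, labels, weights):
    """IRMv1-style penalty for one soft environment: squared gradient of the
    weighted risk with respect to a dummy scale on the logits."""
    scale = torch.tensor(1.0, requires_grad=True)
    per_example = F.binary_cross_entropy_with_logits(
        logits * scale, labels, reduction="none")
    weighted_risk = (weights * per_example).mean()
    grad = torch.autograd.grad(weighted_risk, [scale], create_graph=True)[0]
    return grad ** 2


def infer_environments(logits, labels, steps=1000, lr=0.01):
    """Infer a two-way environment split from a fixed reference model.

    logits, labels: 1-D tensors from the reference (e.g. ERM) model;
    labels are 0/1 floats. Returns a hard environment label per example.
    """
    q_logits = torch.zeros(len(logits), requires_grad=True)
    opt = torch.optim.Adam([q_logits], lr=lr)
    for _ in range(steps):
        q = torch.sigmoid(q_logits)  # soft probability of belonging to env 1
        # Maximize the summed per-environment invariance penalties, i.e.
        # find the partition that most violates invariance for the reference model.
        penalty = (soft_irm_penalty(logits, labels, q)
                   + soft_irm_penalty(logits, labels, 1.0 - q))
        opt.zero_grad()
        (-penalty).backward()
        opt.step()
    return (torch.sigmoid(q_logits) > 0.5).long()
```

The hard labels returned by `infer_environments` would then be handed to any downstream invariant-learning objective (such as IRM or group DRO) in place of annotated environment labels.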