Robustness to certain forms of distribution shift is a key concern in many ML applications. Often, robustness can be formulated as enforcing invariances to particular interventions on the data generating process. Here, we study a flexible, causally-motivated approach to enforcing such invariances, paying particular attention to shortcut learning, where a robust predictor could, in principle, achieve optimal i.i.d. generalization, but instead relies on spurious correlations or shortcuts in practice. Our approach uses auxiliary labels, typically available at training time, to enforce conditional independences between the latent factors that determine these labels. We show both theoretically and empirically that causally-motivated regularization schemes (a) lead to more robust estimators that generalize well under distribution shift, and (b) have better finite-sample efficiency compared to usual regularization schemes, even in the absence of distribution shift. Our analysis highlights important theoretical properties of training techniques commonly used in the causal inference, fairness, and disentanglement literatures.
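To make the regularization scheme concrete, below is a minimal sketch of one common way such a conditional independence constraint is operationalized: penalizing the maximum mean discrepancy (MMD) between the distributions of learned representations across groups defined by the auxiliary label, within each class. The function names (`rbf_kernel`, `mmd_penalty`, `regularized_loss`) and the specific kernel choice are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd_penalty(z_a, z_b, bandwidth=1.0):
    """Biased estimate of the squared MMD between two representation samples."""
    k_aa = rbf_kernel(z_a, z_a, bandwidth).mean()
    k_bb = rbf_kernel(z_b, z_b, bandwidth).mean()
    k_ab = rbf_kernel(z_a, z_b, bandwidth).mean()
    return k_aa + k_bb - 2.0 * k_ab

def regularized_loss(pred_loss, z, aux, y, lam=1.0):
    """Task loss plus a conditional-invariance penalty.

    Within each class y, the representations z should be independent of
    the binary auxiliary label aux (the potential shortcut); we penalize
    the MMD between the two aux-groups, conditional on y.
    """
    penalty = 0.0
    for label in np.unique(y):
        mask = y == label
        z_a = z[mask & (aux == 0)]
        z_b = z[mask & (aux == 1)]
        if len(z_a) > 1 and len(z_b) > 1:
            penalty += mmd_penalty(z_a, z_b)
    return pred_loss + lam * penalty
```

In this sketch, driving the penalty to zero encourages the representation distribution to match across auxiliary-label groups within each class, so the predictor cannot exploit the auxiliary factor as a shortcut; the weight `lam` trades off predictive fit against invariance.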