Covariate-shift generalization, a typical case of out-of-distribution (OOD) generalization, requires good performance on an unknown test distribution that differs from the accessible training distribution in the form of covariate shift. Recently, independence-driven importance weighting algorithms from the stable learning literature have shown empirical effectiveness in handling covariate-shift generalization across several learning models, including regression algorithms and deep neural networks, yet their theoretical analyses are missing. In this paper, we theoretically prove the effectiveness of such algorithms by interpreting them as feature selection processes. We first specify a set of variables, named the minimal stable variable set, that is the minimal and optimal set of variables for dealing with covariate-shift generalization under common loss functions, such as the mean squared loss and the binary cross-entropy loss. We then prove that, under ideal conditions, independence-driven importance weighting algorithms can identify the variables in this set. An analysis of their asymptotic properties is also provided. These theories are further validated in several synthetic experiments.
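As a hedged formalization of the stability notion used above (the paper's precise definitions may differ in detail): under covariate shift, the conditional distribution $P(Y \mid X)$ is shared by the training and test distributions while the marginal $P(X)$ may change. A subset $S$ of the covariates $X$ can then be called stable if
$$\mathbb{E}[Y \mid X_S] = \mathbb{E}[Y \mid X] \quad \text{almost surely},$$
so that predicting with $\mathbb{E}[Y \mid X_S]$ loses nothing relative to using all covariates on any test covariate distribution; a minimal stable variable set is a stable set none of whose proper subsets is stable.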
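To make the reweighting idea concrete, the following is a minimal, self-contained Python sketch, not the specific algorithms analyzed in the paper. It obtains independence-driven weights by estimating the density ratio between the product of the covariates' marginals and their joint distribution, using a probabilistic classifier trained on column-shuffled copies, and then fits a weighted linear regression. The function names `independence_weights` and `weighted_ols`, and the synthetic setup, are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

def independence_weights(X, seed=0):
    """Estimate weights w(x) proportional to q(x)/p(x), where p is the joint
    training distribution of the covariates and q is the product of its
    marginals; under the reweighted distribution the covariates are
    (approximately) mutually independent. The ratio is estimated by a
    classifier that tells original rows apart from column-shuffled rows."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Sample from the product of marginals by permuting each column independently.
    X_shuf = np.column_stack([rng.permutation(X[:, j]) for j in range(p)])
    Z = np.vstack([X, X_shuf])
    labels = np.concatenate([np.zeros(n), np.ones(n)])  # 1 = shuffled
    clf = HistGradientBoostingClassifier(random_state=seed).fit(Z, labels)
    proba = clf.predict_proba(X)
    w = proba[:, 1] / np.clip(proba[:, 0], 1e-6, None)  # odds approximate q/p
    return w / w.mean()                                 # normalize to mean one

def weighted_ols(X, y, w):
    """Coefficients of the w-reweighted least-squares fit (normal equations)."""
    Xw = w[:, None] * X
    return np.linalg.lstsq(Xw.T @ X, Xw.T @ y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 20000
s = rng.normal(size=n)                     # stable covariate: drives y
v = 0.3 * s**3 + 0.5 * rng.normal(size=n)  # covariate spuriously dependent on s
y = s + s**3 + 0.3 * rng.normal(size=n)    # nonlinear in s: linear model is misspecified
X = np.column_stack([s, v])

w = independence_weights(X)
print("unweighted OLS:", weighted_ols(X, y, np.ones(n)))  # loads heavily on v
print("reweighted OLS:", weighted_ols(X, y, w))           # v's coefficient should shrink
```

In this synthetic setup the outcome depends only on the stable covariate `s`, yet the plain linear fit loads heavily on the spurious covariate `v` because the linear model is misspecified; if the reweighting succeeds in rendering the covariates independent, the coefficient on `v` shrinks toward zero, mirroring the feature-selection interpretation of independence-driven importance weighting described in the abstract.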