A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution. However, such an assumption is often violated in the real world due to non-stationarity of the environment or bias in sample selection. In this work, we consider a prevalent setting called covariate shift, where the input distribution differs between the training and test stages while the conditional distribution of the output given the input remains unchanged. Most of the existing methods for covariate shift adaptation are two-step approaches, which first calculate the importance weights and then conduct importance-weighted empirical risk minimization. In this paper, we propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization by minimizing an upper bound of the test risk. We theoretically analyze the proposed method and provide a generalization error bound. We also empirically demonstrate the effectiveness of the proposed method.
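To make the two-step baseline concrete, the following is a minimal sketch (not the paper's one-step method) of conventional covariate shift adaptation: compute importance weights as the test-to-training density ratio, then perform importance-weighted least squares. The Gaussian input distributions and the cubic model here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Covariate shift: training and test inputs come from different
# distributions, but p(y|x) is the same in both stages.
x_tr = rng.normal(0.0, 1.0, 200)   # training inputs ~ N(0, 1)
x_te = rng.normal(0.5, 0.8, 200)   # test inputs ~ N(0.5, 0.8^2)
y_tr = np.sin(x_tr) + 0.1 * rng.normal(size=200)


def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Step 1: importance weights w(x) = p_test(x) / p_train(x).
# Here the densities are known by construction; in practice the
# ratio must be estimated, which is a source of error the
# one-step approach aims to avoid.
w = gauss_pdf(x_tr, 0.5, 0.8) / gauss_pdf(x_tr, 0.0, 1.0)

# Step 2: importance-weighted empirical risk minimization,
# instantiated as weighted least squares with cubic features.
X = np.vander(x_tr, 4)  # columns: x^3, x^2, x, 1
W = np.diag(w)
theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y_tr)
```

Reweighting each training loss by `w(x)` makes the weighted training risk an unbiased estimate of the test risk, which is why the two-step pipeline works when the weights are accurate.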