Learning bounds for doubly-robust covariate shift adaptation

Distribution shift between the training domain and the test domain poses a key challenge for modern machine learning. An extensively studied instance is \emph{covariate shift}, where the marginal distribution of the covariates differs across domains while the conditional distribution of the outcome remains the same. The doubly-robust (DR) estimator, recently introduced by \cite{kato2023double}, combines density ratio estimation with a pilot regression model and exhibits asymptotic normality and $\sqrt{n}$-consistency, even when the pilot estimates converge slowly. However, prior work has focused exclusively on asymptotic results and has left open the question of non-asymptotic guarantees for the DR estimator. This paper establishes the first non-asymptotic learning bounds for DR covariate shift adaptation. Our main contributions are two-fold: (\romannumeral 1) we establish \emph{structure-agnostic} high-probability upper bounds on the excess target risk of the DR estimator that depend only on the $L^2$-errors of the pilot estimates and the Rademacher complexity of the model class, without assuming any specific procedure for obtaining the pilot estimates, and (\romannumeral 2) under \emph{well-specified parameterized models}, we analyze DR covariate shift adaptation using modern techniques for the non-asymptotic analysis of MLE, obtaining bounds whose key terms are governed by a Fisher information mismatch between the source and target distributions. Together, these findings bridge asymptotic efficiency properties and finite-sample out-of-distribution generalization bounds, providing a comprehensive theoretical underpinning for DR covariate shift adaptation.
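To make the doubly-robust idea concrete, the following is a minimal sketch of a generic augmented-importance-weighting construction for the simplest functional, the target mean $E_T[Y]$. It is not the risk-minimization estimator of \cite{kato2023double}; the Gaussian data-generating process, the closed-form density ratio, and all function names (`dr_target_mean`, `simulate`, and the ratio and pilot helpers) are assumptions chosen only for illustration. The estimate stays close to the true target mean when either the density ratio or the pilot regression is correct.

```python
import numpy as np


def dr_target_mean(x_src, y_src, x_tgt, ratio, pilot):
    """Doubly-robust (augmented importance-weighted) estimate of E_T[Y].

    ratio(x): estimate of the density ratio p_target(x) / p_source(x)
    pilot(x): estimate of the regression function E[Y | X = x]
    """
    # Plug-in term: average the pilot prediction over unlabeled target covariates.
    plug_in = np.mean(pilot(x_tgt))
    # Correction term: importance-weighted pilot residuals on labeled source data.
    correction = np.mean(ratio(x_src) * (y_src - pilot(x_src)))
    return plug_in + correction


def simulate(n=200_000, seed=0):
    """Toy covariate shift (an assumption): source X ~ N(0, 1), target X ~ N(1, 1),
    and Y = 2 X + noise in both domains, so the true target mean is 2."""
    rng = np.random.default_rng(seed)
    x_src = rng.normal(0.0, 1.0, n)
    x_tgt = rng.normal(1.0, 1.0, n)
    y_src = 2.0 * x_src + rng.normal(0.0, 0.5, n)
    return x_src, y_src, x_tgt


def true_ratio(x):
    # N(1, 1) density divided by N(0, 1) density equals exp(x - 1/2).
    return np.exp(x - 0.5)


def wrong_ratio(x):
    # Misspecified ratio: ignores the shift entirely.
    return np.ones_like(x)


def true_pilot(x):
    # Correct regression function for the toy model.
    return 2.0 * x


def wrong_pilot(x):
    # Misspecified pilot: predicts zero everywhere.
    return np.zeros_like(x)


if __name__ == "__main__":
    x_src, y_src, x_tgt = simulate()
    print("correct ratio, wrong pilot:",
          dr_target_mean(x_src, y_src, x_tgt, true_ratio, wrong_pilot))
    print("wrong ratio, correct pilot:",
          dr_target_mean(x_src, y_src, x_tgt, wrong_ratio, true_pilot))
    print("both wrong (no guarantee): ",
          dr_target_mean(x_src, y_src, x_tgt, wrong_ratio, wrong_pilot))
```

In this toy setting the first two estimates should land near 2, while the third has no such guarantee. This mirrors the abstract's point that the error of the DR estimator is driven by the errors of both pilot estimates rather than by either one alone.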
