We study linear regression under covariate shift, where the marginal distribution over the input covariates differs between the source and target domains, while the conditional distribution of the output given the input covariates is similar across the two domains. For this problem, we investigate a transfer learning approach that pretrains on the source data and finetunes on the target data, both conducted by online SGD. We establish sharp instance-dependent excess risk upper and lower bounds for this approach. Our bounds suggest that, for a large class of linear regression instances, transfer learning with $O(N^2)$ source data (and scarce or no target data) is as effective as supervised learning with $N$ target data. In addition, we show that finetuning, even with only a small amount of target data, can drastically reduce the amount of source data required by pretraining. Our theory sheds light on the effectiveness and limitations of pretraining as well as the benefits of finetuning for tackling covariate shift problems.
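To make the setup concrete, the following is a minimal sketch of the pretraining-finetuning pipeline described above: one-pass online SGD on a source stream, followed by one-pass online SGD on a (much smaller) target stream, evaluated by the excess risk under the target covariance. All specifics here are illustrative assumptions, not the paper's construction: the dimension, noise level, step sizes, sample sizes, and the diagonal source/target covariance spectra are hypothetical, and the paper's instance-dependent bounds are not computed by this code.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 20                                       # hypothetical covariate dimension
w_star = rng.normal(size=d) / np.sqrt(d)     # shared linear teacher (same conditional in both domains)

def sample(n, cov_diag):
    """Draw covariates with diagonal covariance `cov_diag` and noisy linear responses."""
    X = rng.normal(size=(n, d)) * np.sqrt(cov_diag)
    y = X @ w_star + 0.1 * rng.normal(size=n)
    return X, y

def online_sgd(X, y, w0, lr):
    """One pass of online SGD over the stream (X, y), starting from w0."""
    w = w0.copy()
    for x_t, y_t in zip(X, y):
        w -= lr * (x_t @ w - y_t) * x_t
    return w

# Covariate shift: source and target covariances differ; E[y|x] is shared.
source_cov = np.linspace(2.0, 0.1, d)        # hypothetical source spectrum
target_cov = np.linspace(0.1, 2.0, d)        # hypothetical target spectrum

X_src, y_src = sample(50_000, source_cov)    # plentiful source data for pretraining
X_tgt, y_tgt = sample(200, target_cov)       # scarce target data for finetuning

w_pre = online_sgd(X_src, y_src, np.zeros(d), lr=0.01)   # pretraining on the source stream
w_ft = online_sgd(X_tgt, y_tgt, w_pre, lr=0.01)          # finetuning starts from the pretrained iterate

def target_excess_risk(w):
    """Excess risk on the target domain: (w - w*)^T Sigma_target (w - w*)."""
    return float((w - w_star) @ (target_cov * (w - w_star)))

print("pretraining only   :", target_excess_risk(w_pre))
print("pretrain + finetune:", target_excess_risk(w_ft))
```

Under such an instance, the pretrained iterate already carries most of the signal, and the short finetuning pass corrects the directions where the source covariance provided little coverage; this mirrors, at a purely illustrative level, the claim that a small amount of target data can substantially reduce the source data needed by pretraining.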