In this work, we study the transfer learning problem under high-dimensional generalized linear models (GLMs), where the goal is to improve the fit on target data by borrowing information from useful source data. When the set of transferable sources is known, we propose an oracle algorithm and derive its $\ell_2$-estimation error bounds. The theoretical analysis shows that, under certain conditions, when the target and sources are sufficiently close to each other, the estimation error bound can be improved over that of the classical penalized estimator using only target data. When the transferable sources are not known, we introduce an algorithm-free transferable source detection approach to identify informative sources, and we prove its detection consistency under the high-dimensional GLM transfer learning setting. Extensive simulations and a real-data experiment verify the effectiveness of our algorithms.