We present a new model and methods for the posterior drift problem, in which the regression function in the target domain is modeled as a linear adjustment (on an appropriate scale) of that in the source domain. This idea inherits the simplicity and usefulness of generalized linear models and accelerated failure time models from the classical statistics literature. We study the theoretical properties of our proposed estimator in the binary classification problem. Our approach is shown to be flexible and applicable in a variety of statistical settings, and can be adapted to transfer learning problems in various domains, including epidemiology, genetics and biomedicine. As a concrete application, we illustrate the power of our approach through mortality prediction for British Asians, borrowing strength from similar data from the larger pool of British Caucasians, using the UK Biobank data.
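To make the "linear adjustment on an appropriate scale" idea concrete, the following is a minimal sketch for the binary classification case. It is not the estimator studied in the paper. It assumes a logit scale, a parametric (logistic) source model, and an adjustment that is linear in the covariates. All names (`fit_logistic`, `theta_source`, `delta_true`) and the simulated data are hypothetical and exist only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, y, offset=None, n_iter=50, ridge=1e-8):
    """Newton-Raphson logistic regression with an optional fixed offset.

    The offset is added to the linear predictor and is not estimated.
    """
    n, d = X.shape
    if offset is None:
        offset = np.zeros(n)
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ beta + offset)
        grad = X.T @ (y - p)
        w = p * (1.0 - p)
        # Tiny ridge term keeps the Hessian invertible (an assumption for stability).
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(d)
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (hypothetical): large source sample, small target sample.
rng = np.random.default_rng(0)
theta_source = np.array([-0.5, 1.0, -1.0])  # source logit coefficients
delta_true = np.array([0.8, 0.0, 0.5])      # linear adjustment on the logit scale

X_s = add_intercept(rng.normal(size=(5000, 2)))
y_s = rng.binomial(1, sigmoid(X_s @ theta_source))
X_t = add_intercept(rng.normal(size=(300, 2)))
y_t = rng.binomial(1, sigmoid(X_t @ (theta_source + delta_true)))

# Step 1: estimate the source regression function on the logit scale.
theta_hat = fit_logistic(X_s, y_s)

# Step 2: hold the source logit fixed as an offset and estimate only
# the linear adjustment from the (small) target sample.
delta_hat = fit_logistic(X_t, y_t, offset=X_t @ theta_hat)

def predict_target(X):
    """Target-domain probability: source logit plus fitted adjustment."""
    return sigmoid(X @ (theta_hat + delta_hat))
```

The design choice in this sketch is that the target sample is used only to estimate the low-dimensional adjustment, while the shape of the regression function is borrowed from the larger source sample.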