In transfer learning, we wish to make inference about a target population when we have access to data both from the distribution itself, and from a different but related source distribution. We introduce a flexible framework for transfer learning in the context of binary classification, allowing for covariate-dependent relationships between the source and target distributions that are not required to preserve the Bayes decision boundary. Our main contributions are to derive the minimax optimal rates of convergence (up to poly-logarithmic factors) in this problem, and show that the optimal rate can be achieved by an algorithm that adapts to key aspects of the unknown transfer relationship, as well as the smoothness and tail parameters of our distributional classes. This optimal rate turns out to have several regimes, depending on the interplay between the relative sample sizes and the strength of the transfer relationship, and our algorithm achieves optimality by careful, decision tree-based calibration of local nearest-neighbour procedures.
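To fix ideas, here is a minimal toy sketch of the general setup: a nearest-neighbour classifier that pools labelled source and target samples, down-weighting source labels by a fixed factor. This is only an illustration of the problem setting, not the paper's adaptive procedure (which calibrates local nearest-neighbour rules via decision trees and adapts to the unknown transfer strength); the function name `knn_transfer_predict` and the fixed `src_weight` parameter are hypothetical choices for this sketch.

```python
import numpy as np

def knn_transfer_predict(X_src, y_src, X_tgt, y_tgt, x, k=3, src_weight=0.5):
    """Toy weighted k-nearest-neighbour vote pooling source and target data.

    Source labels are down-weighted by the fixed, hypothetical factor
    `src_weight`; the paper's algorithm instead chooses such weights
    adaptively and locally. Labels are assumed to be in {0, 1}.
    """
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt]).astype(float)
    w = np.concatenate([np.full(len(y_src), src_weight),
                        np.ones(len(y_tgt))])
    # indices of the k nearest pooled neighbours of the query point x
    dists = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(dists)[:k]
    # signed, weighted vote: labels mapped to {-1, +1}
    score = np.sum(w[idx] * (2.0 * y[idx] - 1.0))
    return int(score > 0)

# Tiny synthetic check: two well-separated classes.
X_src = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
y_src = np.array([0, 0, 1, 1])
X_tgt = np.array([[0.05, 0.05], [1.05, 1.05]])
y_tgt = np.array([0, 1])

pred0 = knn_transfer_predict(X_src, y_src, X_tgt, y_tgt, np.array([0.0, 0.0]))
pred1 = knn_transfer_predict(X_src, y_src, X_tgt, y_tgt, np.array([1.0, 1.0]))
```

The fixed global `src_weight` is exactly what a covariate-dependent transfer relationship rules out in general: when the source-target relationship varies across the covariate space, the weighting must be chosen locally, which is the role played by the decision-tree calibration in the paper.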