Transfer learning is a machine learning paradigm in which knowledge from one problem is used to solve a new but related problem. On the one hand, it is conceivable that knowledge from one task could be useful for solving a related task. On the other hand, it is also recognized that, if not executed properly, transfer learning algorithms can impair learning performance instead of improving it, a phenomenon commonly known as negative transfer. In this paper, we study transfer learning from a Bayesian perspective, where a parametric statistical model is used. Specifically, we study three variants of the transfer learning problem: instantaneous, online, and time-variant transfer learning. For each problem, we define an appropriate objective function and provide either exact expressions or upper bounds on the learning performance using information-theoretic quantities, which allow simple and explicit characterizations when the sample size becomes large. Furthermore, examples show that the derived bounds are accurate even for small sample sizes. The obtained bounds give valuable insights into the effect of prior knowledge on transfer learning in our formulation. In particular, we formally characterize the conditions under which negative transfer occurs. Lastly, we devise two (online) transfer learning algorithms that are amenable to practical implementation. Notably, one of these algorithms does not require the parametric assumption, thus extending our results to more general models. We demonstrate the effectiveness of our algorithms on real data sets, especially when the source and target data have a strong similarity.
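As a rough illustration of how source knowledge used as a Bayesian prior can either help or hurt, the following sketch uses a Beta-Bernoulli model. It is not the formulation, objective, or algorithm of the paper: the squared-error criterion, the function names, and all numerical values are assumptions chosen only to make negative transfer visible in a toy setting.

```python
# Toy Beta-Bernoulli sketch of positive vs. negative transfer.
# Assumption: squared error of the posterior-mean estimator is used as the
# performance measure; the paper itself works with its own objective functions.

def posterior_mean_mse(theta, n, a, b):
    """Exact MSE of the posterior mean (a + k) / (a + b + n),
    where k ~ Binomial(n, theta) and the prior is Beta(a, b)."""
    total = a + b + n
    bias = (a + n * theta) / total - theta
    variance = n * theta * (1 - theta) / total ** 2
    return bias ** 2 + variance

def prior_from_source(theta_source, m):
    """Hypothetical helper: Beta prior obtained by updating a uniform
    Beta(1, 1) prior with the expected counts of m source samples."""
    return 1 + m * theta_source, 1 + m * (1 - theta_source)

theta_target = 0.3   # true target parameter (illustrative value)
n_target = 10        # number of target samples
m_source = 50        # number of source samples

# No transfer: uniform prior on the target parameter.
mse_baseline = posterior_mean_mse(theta_target, n_target, 1, 1)

# Source closely matches the target: the prior is informative and helpful.
mse_similar = posterior_mean_mse(
    theta_target, n_target, *prior_from_source(0.3, m_source))

# Source differs strongly from the target: the prior is confidently wrong.
mse_dissimilar = posterior_mean_mse(
    theta_target, n_target, *prior_from_source(0.8, m_source))

print(f"no transfer:       {mse_baseline:.4f}")
print(f"similar source:    {mse_similar:.4f}")
print(f"dissimilar source: {mse_dissimilar:.4f}")
```

In this toy setting, a source that matches the target lowers the error relative to the uniform prior, while a mismatched source raises it. The second case is the kind of negative transfer described in the abstract.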