Feature-based transfer is one of the most effective methodologies for transfer learning. Existing studies usually assume that the learned new feature representation is \emph{domain-invariant}, and thus train a transfer model $\mathcal{M}$ on the source domain. In this paper, we consider a more realistic scenario in which the new feature representation is suboptimal and a small divergence still exists across domains. We propose a new transfer model, the Randomized Transferable Machine (RTM), to handle such small divergence. Specifically, we work on the new source and target data produced by existing feature-based transfer methods. The key idea is to enlarge the source training data population by randomly corrupting the new source data with noise, and then to train a transfer model $\widetilde{\mathcal{M}}$ that performs well on all the corrupted source data populations. In principle, the more corruptions that are made, the higher the probability that the new target data are covered by the constructed source data populations, and hence the better the transfer performance achieved by $\widetilde{\mathcal{M}}$. The ideal case involves infinitely many corruptions, which is infeasible in practice. We therefore develop a marginalized solution that trains $\widetilde{\mathcal{M}}$ without performing any explicit corruption, yet is equivalent to training on infinitely many corrupted source data populations. We further propose two instantiations of $\widetilde{\mathcal{M}}$ that theoretically demonstrate superior transferability over the conventional transfer model $\mathcal{M}$. More importantly, both instantiations have closed-form solutions, leading to a fast and efficient training process. Experiments on various real-world transfer tasks show that RTM is a promising transfer model.
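To illustrate the marginalization idea in its simplest form (this is a generic sketch, not the paper's RTM instantiations), consider plain least squares trained on copies of the data corrupted by additive Gaussian noise. As the number of corrupted copies grows, the expected corrupted design matrix admits a closed form, $\mathbb{E}[\widetilde{X}^\top\widetilde{X}] = X^\top X + n\sigma^2 I$, so the "infinite corruption" limit is equivalent to a ridge-regularized solve with no corruption performed at all. All variable names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 200, 5, 0.3          # samples, features, corruption noise level
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# Explicit corruption: stack K noisy copies of the source data and
# fit ordinary least squares on the enlarged population.
K = 1000
Xc = np.vstack([X + sigma * rng.normal(size=X.shape) for _ in range(K)])
yc = np.tile(y, K)
w_mc = np.linalg.solve(Xc.T @ Xc, Xc.T @ yc)

# Marginalized closed form: E[X̃ᵀX̃] = XᵀX + n σ² I, so training on
# infinitely many corrupted copies reduces to a single ridge solve.
w_marg = np.linalg.solve(X.T @ X + n * sigma**2 * np.eye(d), X.T @ y)

print(np.max(np.abs(w_mc - w_marg)))  # → small; the two agree as K grows
```

The same pattern, in which taking the expectation over the corruption distribution yields a closed-form estimator, is what makes the marginalized training fast: no corrupted copies are ever materialized.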