Transfer learning is crucial for training deep neural networks on new target tasks. Current transfer learning methods generally assume at least one of the following: (i) the source and target task label spaces overlap, (ii) the source datasets are available, and (iii) the target network architecture is consistent with the source one. However, all of these assumptions are difficult to satisfy in practical settings, because the target task rarely has the same labels as the source task, access to the source dataset is restricted due to licensing and storage costs, and the target architecture is often specialized to each task. To transfer source knowledge without these assumptions, we propose a transfer learning method that uses deep generative models and consists of two stages: pseudo pre-training (PP) and pseudo semi-supervised learning (P-SSL). PP trains the target architecture on a dataset synthesized with conditional source generative models. P-SSL applies SSL algorithms to labeled target data and unlabeled pseudo samples, which are generated by cascading the source classifier and generative models so that they are conditioned on target samples. Our experimental results indicate that our method outperforms the baselines of scratch training and knowledge distillation.
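To make the two-stage pipeline concrete, below is a minimal PyTorch sketch under illustrative assumptions: all models are tiny placeholders (in practice the conditional generator and source classifier would be pre-trained on the source task), and confidence-thresholded pseudo-labeling stands in for the SSL algorithm of P-SSL. It is not the paper's implementation, only an outline of the data flow.

```python
# Minimal sketch of pseudo pre-training (PP) + pseudo SSL (P-SSL).
# All models and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N_SRC, N_TGT, Z_DIM, X_DIM = 10, 5, 64, 32

class CondGenerator(nn.Module):
    """Placeholder conditional source generator G(z | y_src)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(N_SRC, Z_DIM)
        self.net = nn.Sequential(nn.Linear(Z_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, X_DIM))
    def forward(self, z, y):
        return self.net(z + self.emb(y))

G = CondGenerator()  # stand-in; assumed pre-trained on the source dataset
backbone = nn.Sequential(nn.Linear(X_DIM, 128), nn.ReLU())  # target architecture
head_src = nn.Linear(128, N_SRC)  # source-class head, used during PP
head_tgt = nn.Linear(128, N_TGT)  # target-task head, used during P-SSL

# ---- Stage 1: pseudo pre-training (PP) ----
# Train the target architecture on a dataset synthesized by the conditional
# source generator, labeled with the conditioning source classes.
opt = torch.optim.Adam(list(backbone.parameters()) +
                       list(head_src.parameters()), lr=1e-3)
for _ in range(200):
    y = torch.randint(0, N_SRC, (64,))
    with torch.no_grad():
        x = G(torch.randn(64, Z_DIM), y)
    loss = F.cross_entropy(head_src(backbone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()

# ---- Stage 2: pseudo semi-supervised learning (P-SSL) ----
x_l = torch.randn(32, X_DIM)          # toy labeled target inputs
y_l = torch.randint(0, N_TGT, (32,))  # toy target labels

# Cascade source classifier -> generator: infer a source-class distribution
# for each labeled target sample, sample source classes from it, and generate
# related pseudo unlabeled samples. (The PP-trained source head stands in
# for the pre-trained source classifier in this sketch.)
with torch.no_grad():
    p_src = F.softmax(head_src(backbone(x_l)), dim=1)
    y_src = torch.multinomial(p_src, num_samples=4, replacement=True).reshape(-1)
    x_u = G(torch.randn(y_src.numel(), Z_DIM), y_src)

# Apply an SSL algorithm (confidence-thresholded pseudo-labeling, an assumed
# stand-in) to the labeled target data plus the pseudo unlabeled samples.
opt = torch.optim.Adam(list(backbone.parameters()) +
                       list(head_tgt.parameters()), lr=1e-3)
for _ in range(200):
    sup = F.cross_entropy(head_tgt(backbone(x_l)), y_l)
    with torch.no_grad():
        conf, pseudo_y = F.softmax(head_tgt(backbone(x_u)), dim=1).max(dim=1)
        mask = (conf > 0.8).float()  # keep only confident pseudo-labels
    unsup = (F.cross_entropy(head_tgt(backbone(x_u)), pseudo_y,
                             reduction='none') * mask).mean()
    loss = sup + unsup
    opt.zero_grad(); loss.backward(); opt.step()
```

The key design point visible in the sketch is that the target network never touches the source dataset itself: PP consumes only generator outputs, and P-SSL conditions the generator through the source classifier's predictions on target samples, so the method needs no label-space overlap, no source data access, and no shared architecture.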