Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) the source and target task label spaces overlap, (ii) the source datasets are available, and (iii) the target network architecture is consistent with the source one. However, these assumptions are difficult to hold in practical settings: the target task rarely shares labels with the source task, access to source datasets is restricted by storage costs and privacy, and the target architecture is often specialized to each task. To transfer source knowledge without these assumptions, we propose a transfer learning method that uses deep generative models and is composed of two stages: pseudo pre-training (PP) and pseudo semi-supervised learning (P-SSL). PP trains a target architecture on an artificial dataset synthesized by conditional source generative models. P-SSL applies SSL algorithms to labeled target data together with unlabeled pseudo samples, which are generated by cascading the source classifier and generative models so as to condition them on target samples. Our experimental results indicate that our method can outperform the baselines of scratch training and knowledge distillation.
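As a rough illustration of the two stages described above, the sketch below assumes a class-conditional source generator `generator(z, y)` and a pre-trained source classifier `source_classifier`; these names, the sampling scheme, and all hyperparameters are hypothetical placeholders, not the reference implementation.

```python
# Minimal sketch of the PP + P-SSL pipeline, under the assumptions stated above.
import torch
import torch.nn.functional as F

def pseudo_pretrain(target_net, generator, num_source_classes,
                    steps=1000, batch_size=64, z_dim=128, lr=1e-3):
    """Stage 1 (PP): train the target architecture on synthetic samples
    drawn from the conditional source generative model."""
    opt = torch.optim.Adam(target_net.parameters(), lr=lr)
    for _ in range(steps):
        y = torch.randint(num_source_classes, (batch_size,))   # random source labels
        z = torch.randn(batch_size, z_dim)
        x = generator(z, y)                                    # synthetic "source" data
        loss = F.cross_entropy(target_net(x), y)               # supervised pseudo pre-training
        opt.zero_grad()
        loss.backward()
        opt.step()
    return target_net

def sample_pseudo_unlabeled(x_target, source_classifier, generator, z_dim=128):
    """Stage 2 (P-SSL) sampling: cascade the source classifier and the
    generator so that pseudo samples are conditioned on target samples."""
    with torch.no_grad():
        probs = F.softmax(source_classifier(x_target), dim=1)  # source-label beliefs for target data
        y = torch.multinomial(probs, 1).squeeze(1)             # sample one source label per target sample
        z = torch.randn(x_target.size(0), z_dim)
        return generator(z, y)                                 # unlabeled pseudo samples for SSL
```

The pseudo samples returned by `sample_pseudo_unlabeled` would then stand in for real unlabeled data: they are combined with the labeled target data and fed to an off-the-shelf SSL algorithm to fine-tune the pseudo-pre-trained target network.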