Transfer learning aims to leverage knowledge from pre-trained models to benefit the target task. Prior transfer learning work mainly transfers from a single model. However, with the emergence of deep models pre-trained on diverse resources, model hubs consisting of models with various architectures, pre-training datasets, and learning paradigms have become available. Directly applying single-model transfer learning methods to each model wastes the abundant knowledge of the model hub and incurs high computational cost. In this paper, we propose a Hub-Pathway framework to enable knowledge transfer from a model hub. The framework generates data-dependent pathway weights, based on which it assigns pathway routes at the input level to decide which pre-trained models are activated and passed through, and then sets the pathway aggregation at the output level to aggregate the knowledge from different models into predictions. The framework can be trained end-to-end with the target task-specific loss, learning to explore better pathway configurations and to exploit the knowledge in the pre-trained models for each target datum. We employ a noisy pathway generator and design an exploration loss to further explore different pathways throughout the model hub. To fully exploit the knowledge in the pre-trained models, each model is further trained on the specific data that activate it, which ensures its performance and enhances knowledge transfer. Experimental results on computer vision and reinforcement learning tasks demonstrate that the proposed Hub-Pathway framework achieves state-of-the-art performance for model hub transfer learning.
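The data-dependent routing and aggregation described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the class and function names (`HubPathway`, `forward`), the linear pathway generator, and the top-k activation rule are all assumptions made for the sake of the example.

```python
# Minimal sketch of the Hub-Pathway forward pass: a pathway generator
# scores every model in the hub for a given input, only the top-k models
# are activated (pathway routing), and their outputs are combined with
# normalized pathway weights (pathway aggregation). Hypothetical design.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class HubPathway:
    """Route each input through the top-k pre-trained models and
    aggregate their outputs with data-dependent pathway weights."""

    def __init__(self, models, feat_dim, k=2, noise_std=0.1):
        self.models = models        # list of callables: x -> logits
        self.k = k
        self.noise_std = noise_std  # noisy generator encourages exploration
        # linear pathway generator: input features -> one score per model
        self.W = rng.normal(0.0, 0.01, size=(feat_dim, len(models)))

    def forward(self, x, train=True):
        scores = x @ self.W
        if train:  # add noise during training to explore other pathways
            scores = scores + rng.normal(0.0, self.noise_std, scores.shape)
        weights = softmax(scores)
        # pathway routing: activate only the top-k models for this datum
        active = np.argsort(weights)[-self.k:]
        # pathway aggregation: weighted sum of the activated models' outputs
        w = weights[active] / weights[active].sum()
        out = sum(wi * self.models[i](x) for wi, i in zip(w, active))
        return out, active

# usage: a toy hub of three "pre-trained models" (random linear heads)
models = [lambda x, M=rng.normal(size=(4, 3)): x @ M for _ in range(3)]
hub = HubPathway(models, feat_dim=4, k=2)
out, active = hub.forward(rng.normal(size=4), train=False)
```

In a full system the per-datum `active` sets also determine which pre-trained models receive further training on the data that activate them, and the end-to-end loss would include the exploration term mentioned in the abstract.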