To build artificial neural networks analogous to biological intelligence systems, recent works have unified numerous tasks into a generalist model, which can process various tasks with shared parameters and without any task-specific modules. While generalist models achieve promising results on various benchmarks, their performance degrades on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main factor behind this phenomenon. To mitigate such interference, we introduce Conditional Mixture-of-Experts (Conditional MoEs) into generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and the generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs preserves the generalization ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video captioning. Code and pre-trained generalist models shall be released.
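As a minimal sketch of the idea (not the paper's implementation), the following shows a conditional MoE layer whose routing decision depends on a condition embedding, e.g. a task or modality identifier, rather than on each token alone. All names, shapes, and the top-k gating scheme here are illustrative assumptions; the point is that tokens from different tasks can be dispatched to different expert subsets, reducing cross-task interference while keeping inference cost bounded by the number of selected experts.

```python
import numpy as np

# Illustrative sketch (hypothetical, not the paper's code): a Conditional MoE
# layer that routes on a condition embedding shared by all tokens of a task.
rng = np.random.default_rng(0)

D, E = 8, 4                                    # hidden size, number of experts
W_experts = rng.normal(size=(E, D, D)) * 0.1   # one linear expert per slot
W_route = rng.normal(size=(D, E)) * 0.1        # router over condition embeddings

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def conditional_moe(tokens, cond, top_k=2):
    """tokens: (N, D) inputs; cond: (D,) condition embedding (assumed given).

    All tokens under the same condition share one routing decision, so only
    top_k experts are executed per forward pass.
    """
    gate = softmax(W_route.T @ cond)           # (E,) routing weights
    top = np.argsort(gate)[-top_k:]            # indices of the selected experts
    w = gate[top] / gate[top].sum()            # renormalize over chosen experts
    out = np.zeros_like(tokens)
    for weight, e in zip(w, top):
        out += weight * (tokens @ W_experts[e])
    return out

# Two different "tasks" may route to different expert subsets.
x = rng.normal(size=(5, D))
cond_caption = rng.normal(size=D)
cond_retrieval = rng.normal(size=D)
y1 = conditional_moe(x, cond_caption)
y2 = conditional_moe(x, cond_retrieval)
print(y1.shape)
```

Conditioning the router on task/modality level signals, rather than per-token features, is what keeps routing stable across a task and cheap at inference; finer-grained conditions trade cost for flexibility, which is the trade-off the proposed routing strategies navigate.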