Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: from a pre-trained foundation model, they fine-tune the weights on the target task of interest. As a result, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks: these individual fine-tunings exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims at maximizing the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models.
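The recipe described above (recycle auxiliary fine-tunings as initializations, fine-tune each in parallel on the target task, then average the resulting weights) can be summarized in a few lines. The following is a minimal sketch assuming PyTorch-style state dicts and a hypothetical `fine_tune` routine supplied by the caller; it illustrates the procedure, not the authors' reference implementation.

```python
# Minimal sketch of the ratatouille recycling recipe described above.
# `fine_tune` and the auxiliary checkpoints are hypothetical placeholders,
# not part of the paper's released code.
from copy import deepcopy
import torch


def ratatouille(auxiliary_state_dicts, target_loader, fine_tune):
    """Fine-tune the target task from each auxiliary initialization,
    then average all fine-tuned weights into a single final model."""
    fine_tuned = []
    for aux_weights in auxiliary_state_dicts:
        # One parallel fine-tuning per auxiliary initialization.
        fine_tuned.append(fine_tune(deepcopy(aux_weights), target_loader))

    # Uniform average of the fine-tuned weights (the "ratatouille" step).
    averaged = deepcopy(fine_tuned[0])
    for key in averaged:
        averaged[key] = torch.stack(
            [sd[key].float() for sd in fine_tuned]
        ).mean(dim=0)
    return averaged
```

Averaging is only expected to work here because all fine-tunings start from the same pre-trained foundation model, which keeps the weights in a shared, linearly connectable region.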