Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: from a pre-trained foundation model, they fine-tune the weights on the target task of interest. As a result, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks: these individual fine-tunings exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims at maximizing the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models.
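The recipe described above (recycle auxiliary fine-tunings as initializations, fine-tune each in parallel on the target task, then average the resulting weights) can be summarized in a few lines. The following is a minimal sketch assuming PyTorch-style state dicts and a hypothetical `fine_tune` routine supplied by the caller; it illustrates the procedure, not the authors' reference implementation.

```python
# Minimal sketch of the ratatouille recycling recipe described above.
# `fine_tune` and the auxiliary checkpoints are hypothetical placeholders,
# not part of the paper's released code.
from copy import deepcopy
import torch


def ratatouille(auxiliary_state_dicts, target_loader, fine_tune):
    """Fine-tune the target task from each auxiliary initialization,
    then average all fine-tuned weights into a single final model."""
    fine_tuned = []
    for aux_weights in auxiliary_state_dicts:
        # One parallel fine-tuning per auxiliary initialization.
        fine_tuned.append(fine_tune(deepcopy(aux_weights), target_loader))

    # Uniform average of the fine-tuned weights (the "ratatouille" step).
    averaged = deepcopy(fine_tuned[0])
    for key in averaged:
        averaged[key] = torch.stack(
            [sd[key].float() for sd in fine_tuned]
        ).mean(dim=0)
    return averaged
```

Averaging is only expected to work here because all fine-tunings start from the same pre-trained foundation model, which keeps the weights in a shared, linearly connectable region.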