Large pre-trained models exhibit distinct and complementary capabilities depending on the data they are trained on. Language models such as GPT-3 are capable of textual reasoning but cannot understand visual information, while vision models such as DALL-E can generate photorealistic images but fail to understand complex language descriptions. In this work, we propose a unified framework for composing ensembles of different pre-trained models -- combining the strengths of each individual model to solve various multimodal problems in a zero-shot manner. We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization. The generator constructs proposals and the scorers iteratively provide feedback to refine the generated result. Such closed-loop communication enables models to correct errors caused by other models, significantly boosting performance on downstream tasks, e.g., improving accuracy on grade school math problems by 7.5%, without requiring any model finetuning. We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer, by leveraging the strengths of each expert model. Results show that the proposed method can be used as a general-purpose framework for a wide range of zero-shot multimodal tasks, such as image generation, video question answering, mathematical reasoning, and robotic manipulation. Project page: https://energy-based-model.github.io/composing-pretrained-models.
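To make the generator-scorer loop concrete, below is a minimal Python sketch of closed-loop iterative consensus, not the authors' implementation. The `generator` and `scorers` callables, the averaging rule for consensus, and the function name `iterative_consensus` are all illustrative assumptions; the abstract only specifies that a generator proposes candidates and an ensemble of scorers iteratively refines them.

```python
# Hypothetical sketch of closed-loop iterative consensus optimization:
# a generator proposes candidate solutions, an ensemble of scorers
# evaluates them, and the best candidate conditions the next round
# of proposals. Averaging scorer feedback is one plausible consensus
# rule, assumed here for illustration.

from typing import Callable, List, Optional

def iterative_consensus(
    generator: Callable[[str, Optional[str]], List[str]],  # (prompt, prev_best) -> candidates
    scorers: List[Callable[[str, str], float]],            # (prompt, candidate) -> score
    prompt: str,
    num_rounds: int = 5,
) -> Optional[str]:
    best, best_score = None, float("-inf")
    for _ in range(num_rounds):
        # Closed loop: the previous best result conditions new proposals,
        # letting the generator correct errors flagged by the scorers.
        candidates = generator(prompt, best)
        for cand in candidates:
            # Consensus across the ensemble: average all scorers' feedback.
            score = sum(s(prompt, cand) for s in scorers) / len(scorers)
            if score > best_score:
                best, best_score = cand, score
    return best
```

No model weights are updated in this loop, which is why the approach operates zero-shot and requires no finetuning.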