This paper presents solutions to the Machine Learning Model Attribution Challenge (MLMAC), jointly organized by MITRE, Microsoft, Schmidt Futures, Robust Intelligence, Lincoln Network, and the Hugging Face community. The challenge provides twelve open-source base versions of popular language models developed by well-known organizations, together with twelve language models fine-tuned for text generation. The names and architectural details of the fine-tuned models are kept hidden, and participants can access them only through REST APIs provided by the organizers. Given these constraints, the goal of the contest is to identify which base model each fine-tuned model originated from. To solve this challenge, we assume that a fine-tuned model and its corresponding base version share a similar vocabulary and a matching syntactic writing style that carries over into their generated outputs. Our strategy is to develop a set of queries to interrogate the base and fine-tuned models, and then perform one-to-many pairing between them based on similarities in their generated responses, where more than one fine-tuned model can pair with a base model but not vice versa. We employ four distinct approaches to measure the resemblance between the responses generated by the models of the two sets. The first approach uses machine translation evaluation metrics, and the second uses a vector space model. The third approach uses state-of-the-art Transformer models for multi-class text classification. Lastly, the fourth approach uses a set of Transformer-based binary text classifiers, one for each provided base model, to perform multi-class text classification in a one-vs-all fashion. This paper reports the implementation details, comparisons, and experimental studies of these approaches, along with the final results obtained.
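As a rough illustration of the pairing strategy described above, the sketch below implements a minimal version of the vector-space approach: each model's responses to a shared query set are concatenated into a single document, the documents are embedded with TF-IDF, and every fine-tuned model is assigned the base model whose responses are most similar by cosine similarity. The query set and the query_model helper are hypothetical stand-ins for the organizers' REST-API access, and this is a simplified sketch rather than the exact pipeline used in the paper.

```python
# Minimal sketch of one-to-many attribution via a vector space model (TF-IDF + cosine similarity).
# query_model(model_id, query) is a hypothetical helper that returns a model's generated text;
# in the challenge, such responses would be obtained through the organizers' REST APIs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def collect_responses(model_ids, queries, query_model):
    """Concatenate each model's responses to the shared query set into one document."""
    return {m: " ".join(query_model(m, q) for q in queries) for m in model_ids}


def attribute(base_docs, finetuned_docs):
    """Pair every fine-tuned model with the most similar base model.

    The mapping is one-to-many: several fine-tuned models may be assigned
    to the same base model, but each fine-tuned model gets exactly one base.
    """
    base_ids = list(base_docs)
    ft_ids = list(finetuned_docs)

    # Fit a single TF-IDF vocabulary over all response documents.
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(
        [base_docs[b] for b in base_ids] + [finetuned_docs[f] for f in ft_ids]
    )
    base_vecs = matrix[: len(base_ids)]
    ft_vecs = matrix[len(base_ids):]

    # Similarity matrix of shape (n_finetuned, n_base); pick the best base per row.
    sims = cosine_similarity(ft_vecs, base_vecs)
    return {f: base_ids[sims[i].argmax()] for i, f in enumerate(ft_ids)}
```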