Changing how pre-trained models behave -- e.g., improving their performance on a downstream task or mitigating biases learned during pre-training -- is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around \textit{task vectors}. A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined together through arithmetic operations such as negation and addition, and the behavior of the resulting model is steered accordingly. Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks. Moreover, adding task vectors together can improve performance on multiple tasks at once. Finally, when tasks are linked by an analogy relationship of the form ``A is to B as C is to D'', combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training. Overall, our experiments with several models, modalities and tasks show that task arithmetic is a simple, efficient and effective way of editing models.
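As a concrete illustration of the arithmetic described above, the following PyTorch sketch builds task vectors by parameter-wise subtraction and applies negation, addition, and the analogy combination. This is a minimal sketch under the assumption that all models share one architecture; the model and function names are illustrative and are not taken from the paper's released code.

\begin{verbatim}
import torch
import torch.nn as nn

# Tiny stand-in models with identical architectures (illustrative only).
torch.manual_seed(0)
pretrained  = nn.Linear(4, 2)
finetuned_a = nn.Linear(4, 2)  # stand-in for a copy fine-tuned on task A
finetuned_b = nn.Linear(4, 2)  # stand-in for a copy fine-tuned on task B
finetuned_c = nn.Linear(4, 2)  # stand-in for a copy fine-tuned on task C

def task_vector(pre, ft):
    # tau = theta_ft - theta_pre, computed per parameter tensor.
    pre_sd, ft_sd = pre.state_dict(), ft.state_dict()
    return {k: ft_sd[k] - pre_sd[k] for k in pre_sd}

def apply_vector(pre, tau, scale=1.0):
    # theta_new = theta_pre + scale * tau.
    return {k: v + scale * tau[k] for k, v in pre.state_dict().items()}

tau_a = task_vector(pretrained, finetuned_a)
tau_b = task_vector(pretrained, finetuned_b)
tau_c = task_vector(pretrained, finetuned_c)

# Negation: scale = -1 moves away from task A, degrading it while
# leaving behavior on control tasks largely unchanged.
forget_a = nn.Linear(4, 2)
forget_a.load_state_dict(apply_vector(pretrained, tau_a, scale=-1.0))

# Addition: one model that improves on tasks A and B at once.
tau_sum = {k: tau_a[k] + tau_b[k] for k in tau_a}
multi_task = nn.Linear(4, 2)
multi_task.load_state_dict(apply_vector(pretrained, tau_sum))

# Analogy ("A is to B as C is to D"): tau_d ~= tau_c + (tau_b - tau_a),
# built without any training data from task D.
tau_d = {k: tau_c[k] + (tau_b[k] - tau_a[k]) for k in tau_a}
task_d = nn.Linear(4, 2)
task_d.load_state_dict(apply_vector(pretrained, tau_d))
\end{verbatim}

In practice the \texttt{scale} argument plays the role of a scaling coefficient chosen on held-out data when applying a (possibly combined) task vector.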