Changing how pre-trained models behave -- e.g., improving their performance on a downstream task or mitigating biases learned during pre-training -- is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around \textit{task vectors}. A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined through arithmetic operations such as negation and addition, and that the behavior of the resulting model is steered accordingly. Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks. Moreover, adding task vectors together can improve performance on multiple tasks at once. Finally, when tasks are linked by an analogy relationship of the form ``A is to B as C is to D'', combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training. Overall, our experiments with several models, modalities and tasks show that task arithmetic is a simple, efficient and effective way of editing models.
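The arithmetic described above can be illustrated with a minimal sketch, treating a model checkpoint as a dictionary of named weight arrays. The function names and the scaling step are illustrative assumptions, not the paper's reference implementation; in practice these operations are applied to the parameters of real pre-trained and fine-tuned checkpoints.

```python
import numpy as np

def task_vector(pretrained, finetuned):
    """Build a task vector: fine-tuned weights minus pre-trained weights."""
    return {name: finetuned[name] - pretrained[name] for name in pretrained}

def apply_vector(pretrained, vector, scale=1.0):
    """Steer a model by moving along a task vector.

    scale > 0 moves toward the task (improving it); scale < 0 negates
    the vector, decreasing performance on the target task.
    """
    return {name: pretrained[name] + scale * vector[name] for name in pretrained}

def add_vectors(v1, v2):
    """Sum two task vectors to target both tasks at once."""
    return {name: v1[name] + v2[name] for name in v1}
```

With these pieces, the analogy setting ``A is to B as C is to D'' corresponds to combining vectors, e.g. applying `task_vector` for C plus the difference of the vectors for B and A, with no training data from D.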