Many of language models' impressive capabilities originate from their in-context learning: based on instructions or examples, they can infer and perform new tasks without weight updates. In this work, we investigate when representations for new tasks are formed in language models, and how these representations change over the course of the context. We study two different kinds of task representation: "transferrable" representations -- vectors that can transfer a task context to another model instance, even without the full prompt -- and simpler representations of high-level task categories. We show that transferrable task representations evolve in non-monotonic and sporadic ways, while task identity representations persist throughout the context. Specifically, transferrable task representations exhibit a two-fold locality. They successfully condense evidence as more examples are provided in the context, but this evidence-accrual process exhibits strong temporal locality along the sequence dimension, coming online only at certain tokens, even though task identity remains reliably decodable throughout the context. In some cases, transferrable task representations also show semantic locality, capturing only a small task "scope" such as an independent subtask. Language models thus represent new tasks on the fly through both an inert, sustained sensitivity to the task and an active, just-in-time representation that supports inference.
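To make the notion of a "transferrable" task representation concrete, the sketch below (not the paper's code) shows one common way such a vector transfer can be implemented with a Hugging Face decoder-only model and PyTorch forward hooks: the residual-stream activation at a chosen layer and token of a few-shot prompt is patched into a zero-shot forward pass of a second run of the model. The model name gpt2, the layer index, the antonym prompts, and the hook placement are illustrative assumptions, not details taken from the abstract.

```python
# Minimal conceptual sketch of patching a "task vector" from an in-context
# prompt into a zero-shot forward pass. All specifics (model, layer, prompts)
# are illustrative assumptions; a small model like gpt2 may not transfer reliably.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # assumption: any decoder-only HF model with the same module layout
LAYER = 6        # assumption: layer at which the task representation is read out
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

icl_prompt = "hot -> cold\nbig -> small\ntall ->"   # few-shot antonym task
zero_shot = "fast ->"                               # no examples: task is unspecified

# 1) Run the few-shot prompt and keep the last token's hidden state at LAYER.
#    hidden_states[0] is the embedding output, so index LAYER is block LAYER-1's output.
with torch.no_grad():
    out = model(**tok(icl_prompt, return_tensors="pt"), output_hidden_states=True)
task_vector = out.hidden_states[LAYER][0, -1]        # shape: (hidden_dim,)

# 2) Patch that vector into the zero-shot run via a forward hook on the same block.
def patch_last_token(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden[:, -1, :] = task_vector                   # overwrite the final position
    return output

handle = model.transformer.h[LAYER - 1].register_forward_hook(patch_last_token)
with torch.no_grad():
    patched = model(**tok(zero_shot, return_tensors="pt"))
handle.remove()

next_id = patched.logits[0, -1].argmax().item()
print(tok.decode([next_id]))   # ideally an antonym-like completion such as "slow"
```

A probe of when such a vector starts to carry the task (e.g., extracting it after each in-context example rather than only at the end) is the kind of measurement the abstract's "temporal locality" claim refers to; the sketch above only illustrates the transfer mechanism itself.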