Large Language Models (LLMs) fine-tuned for specific domains exhibit strong performance; however, the underlying mechanisms by which fine-tuning reshapes their parameter space are not well understood. Prior work has primarily focused on auto-regressive or general-purpose instruction-tuned models, leaving domain-specialised LLMs under-explored. We present the first systematic study of domain-specific fine-tuning in large medical language models. Our analysis reveals that fine-tuning modifies only a small subset of the representational subspace, essentially preserving the pre-trained model's representations. To interpret these subspace changes, we propose tuning vectors, a novel framework inspired by task vectors that explicitly captures the directional parameter shifts induced by fine-tuning. We demonstrate that these vectors are critical for enhancing both instruction-following and generation quality. Furthermore, combining tuning vectors across different domains yields improved generalisation. Upon closer inspection of directional alignment, we find that these vectors primarily write new directional information into the MLP layers of the model, while amplifying existing directions in the attention heads. Our findings offer new insights into LLM adaptation and provide a general, interpretable framework for analysing specialisation in large language models.
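To make the idea concrete, the sketch below illustrates one plausible reading of tuning vectors, assuming they follow the task-vector formulation (the parameter-wise difference between a fine-tuned and a pre-trained checkpoint) and that directional alignment is measured as the cosine between each shift and the corresponding pre-trained weights. The model identifiers are placeholders, and the flattened per-parameter cosine is a coarse proxy for the paper's analysis, not its exact procedure.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder checkpoints; substitute the actual pre-trained base and the
# domain (e.g. medical) fine-tuned model derived from it.
BASE_ID = "base-llm"
TUNED_ID = "medical-llm"

base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.float32)
tuned = AutoModelForCausalLM.from_pretrained(TUNED_ID, torch_dtype=torch.float32)

tuning_vectors = {}  # per-parameter directional shift induced by fine-tuning
alignment = {}       # cosine similarity between each shift and the pre-trained weights

for (name, w_base), (_, w_tuned) in zip(base.named_parameters(),
                                        tuned.named_parameters()):
    delta = (w_tuned - w_base).flatten()
    tuning_vectors[name] = delta
    alignment[name] = torch.nn.functional.cosine_similarity(
        delta, w_base.flatten(), dim=0
    ).item()

# A cosine near zero suggests the shift writes a new direction relative to the
# pre-trained weights (the behaviour reported for MLP layers), while a larger
# positive cosine suggests it amplifies an existing direction (attention heads).
for name, cos in alignment.items():
    kind = "mlp" if "mlp" in name else ("attn" if "attn" in name else "other")
    print(f"{kind:5s} {name}: cos(delta, w_base) = {cos:+.3f}")
```

Under the same assumption, combining tuning vectors across domains would amount to adding several such deltas (optionally scaled) back onto the pre-trained weights, in the spirit of task-vector arithmetic.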