AFA-LoRA: Enabling Non-Linear Adaptations in LoRA with Activation Function Annealing

Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method. However, its adaptation is purely linear, which limits its expressive power and leaves a gap between linear and non-linear training. To bridge this gap, we propose AFA-LoRA, a novel training strategy that brings non-linear expressivity to LoRA while preserving its seamless mergeability. Our key innovation is an annealed activation function that transitions from a non-linear to a linear transformation during training, so the adapter benefits from stronger representational capacity early on and then converges to a mergeable linear form. We apply our method to supervised fine-tuning, reinforcement learning, and speculative decoding. The results show that AFA-LoRA narrows the performance gap between LoRA and full-parameter training. This work enables a more powerful and practical paradigm for parameter-efficient adaptation.
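
The annealing idea can be illustrated with a minimal sketch. The abstract does not specify the form of the annealed activation, its schedule, or where it sits in the adapter, so everything below is an assumption made for illustration: a GELU blended with the identity through a coefficient `lam`, a linear schedule `anneal_coeff`, and the activation placed on the rank-r bottleneck between the two LoRA matrices `A` and `B`. All function names are hypothetical. The point is only that once `lam` reaches 1 the branch is exactly `B @ A @ x`, so it can be merged into the base weight like ordinary LoRA.

```python
import math

def gelu(x):
    """Exact GELU, computed from the Gaussian CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def anneal_coeff(step, total_steps):
    """Assumed linear schedule: 0.0 at step 0, 1.0 from total_steps onward."""
    return min(1.0, max(0.0, step / total_steps))

def annealed_act(x, lam):
    """Assumed blend: lam = 0 gives pure GELU, lam = 1 gives the identity."""
    return (1.0 - lam) * gelu(x) + lam * x

def matvec(M, v):
    """Matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(P, Q):
    """Matrix-matrix product on nested lists."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def adapter_forward(A, B, x, lam):
    """LoRA branch B @ act(A @ x); activation placement is an assumption."""
    hidden = [annealed_act(h, lam) for h in matvec(A, x)]
    return matvec(B, hidden)

# Toy adapter: input dim 3, rank 2, output dim 2 (arbitrary values).
A = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
B = [[1.0, -2.0], [0.5, 0.25]]
x = [1.0, 2.0, -1.0]

start = adapter_forward(A, B, x, anneal_coeff(0, 100))    # non-linear branch
end = adapter_forward(A, B, x, anneal_coeff(100, 100))    # fully annealed branch
merged = matvec(matmul(B, A), x)                          # plain merged LoRA update
```

In this sketch `end` coincides with `merged`, while `start` does not, which mirrors the abstract's claim that the adapter trains with non-linear capacity and finishes in a mergeable linear form. A real implementation would apply the same blend inside a framework module and might use a different activation or a non-linear schedule.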
