With the ever-growing size of pre-trained models (PMs), fine-tuning them has become more expensive and resource-hungry. As a remedy, low-rank adapters (LoRA) keep the main pre-trained weights of the model frozen and only introduce learnable truncated-SVD modules (so-called LoRA blocks) into the model. While LoRA blocks are parameter-efficient, they suffer from two major problems: first, the size of these blocks is fixed and cannot be modified after training (for example, if we need to change the rank of LoRA blocks, we need to re-train them from scratch); second, optimizing their rank requires an exhaustive and costly search. In this work, we introduce a dynamic low-rank adaptation (DyLoRA) technique that addresses both problems together. Our DyLoRA method trains LoRA blocks for a range of ranks instead of a single rank by sorting out the representations learned by the adapter module at different ranks during training. We evaluate our solution on different tasks of the GLUE benchmark using the RoBERTa model. Our results show that we can train dynamic, search-free models with DyLoRA at least $7\times$ faster than LoRA without significantly compromising performance. Moreover, our models perform consistently well over a much wider range of ranks than LoRA.
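To make the mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of a DyLoRA-style linear layer (the class name `DyLoRALinear` and all hyperparameters are ours, not from the paper): at each training step a rank $b$ is sampled from the target range and only the first $b$ rows/columns of the adapter factors are used, so after training the adapter can be truncated to any rank in that range without re-training.

```python
# Hypothetical sketch of the DyLoRA idea, not the authors' implementation.
import torch
import torch.nn as nn


class DyLoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r_min=1, r_max=8, alpha=16.0):
        super().__init__()
        # Frozen pre-trained weight (stands in for the PM's original layer).
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False
        )
        # Trainable low-rank factors, allocated for the maximum rank.
        self.lora_down = nn.Parameter(torch.randn(r_max, in_features) * 0.01)
        self.lora_up = nn.Parameter(torch.zeros(out_features, r_max))
        self.r_min, self.r_max = r_min, r_max
        self.scaling = alpha / r_max

    def forward(self, x, rank=None):
        # During training, sample a rank b per step; at inference, pass any
        # fixed rank in [r_min, r_max] -- no re-training needed to change it.
        if rank is None:
            rank = int(torch.randint(self.r_min, self.r_max + 1, (1,)))
        down = self.lora_down[:rank]   # (b, in): first b rows only
        up = self.lora_up[:, :rank]    # (out, b): first b columns only
        return x @ self.weight.T + (x @ down.T) @ up.T * self.scaling
```

Because every step trains a truncated prefix of the adapter factors, the earlier rows/columns are exercised at all ranks, which is what lets one trained module serve the whole rank range instead of a single fixed rank.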