Visual Parameter-Efficient Tuning (VPET) has emerged as a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority, easing both the storage burden and the optimization difficulty. However, existing VPET methods introduce trainable parameters at the same positions across different tasks, relying solely on human heuristics and neglecting domain gaps. To this end, we study where to introduce and how to allocate trainable parameters by proposing a novel Sensitivity-aware visual Parameter-efficient Tuning (SPT) scheme, which adaptively allocates trainable parameters to task-specific important positions under a desired tunable parameter budget. Specifically, SPT first quickly identifies, in a data-dependent manner, the sensitive parameters that require tuning for a given task. Next, for weight matrices whose number of sensitive parameters exceeds a pre-defined threshold, SPT further boosts their representational capability by applying an existing structured tuning method, e.g., LoRA or Adapter, in place of directly tuning the selected sensitive parameters (unstructured tuning), while staying within the budget. Extensive experiments on a wide range of downstream recognition tasks show that SPT is complementary to existing VPET methods and significantly boosts their performance, e.g., SPT improves Adapter with a supervised pre-trained ViT-B/16 backbone by 4.2% and 1.4% mean Top-1 accuracy, reaching state-of-the-art performance on the FGVC and VTAB-1k benchmarks, respectively. Source code is available at https://github.com/ziplab/SPT
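To make the two steps above concrete, here is a minimal PyTorch sketch. It assumes sensitivity is approximated by a one-batch first-order Taylor criterion, the absolute value of parameter times gradient (an assumption on our part; see the paper for the exact criterion), and then decides per weight matrix between structured tuning (e.g., LoRA or Adapter) and unstructured tuning of the selected entries. All names here (`parameter_sensitivity`, `allocate`, `tau`) are hypothetical and not taken from the released code.

```python
import torch

def parameter_sensitivity(model, loss_fn, batch):
    """Data-dependent sensitivity: one forward/backward pass on a single
    batch, then score each parameter by |theta * grad| (first-order Taylor
    approximation; an assumed criterion for this sketch)."""
    inputs, targets = batch
    # All parameters must have requires_grad=True for this one-off pass,
    # even those that will stay frozen during actual tuning.
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is not None:
            scores[name] = (p.detach() * p.grad.detach()).abs()
    model.zero_grad()
    return scores

def allocate(scores, budget, tau):
    """Keep the `budget` most sensitive parameters globally; for each weight
    matrix, switch to structured tuning if its sensitive-parameter count
    exceeds the threshold `tau`, otherwise tune the masked entries directly."""
    flat = torch.cat([s.flatten() for s in scores.values()])
    # `budget` must not exceed the total parameter count.
    cutoff = torch.topk(flat, budget).values.min()
    plan = {}
    for name, s in scores.items():
        mask = s >= cutoff
        n_sensitive = int(mask.sum())
        if n_sensitive > tau:
            plan[name] = ("structured", None)    # replace with LoRA/Adapter
        elif n_sensitive > 0:
            plan[name] = ("unstructured", mask)  # tune only masked entries
    return plan
```

In this reading, the threshold `tau` trades off the two regimes: matrices dense in sensitive parameters gain more capacity from a structured module at comparable parameter cost, while sparsely sensitive matrices are handled by cheap unstructured updates, keeping the overall allocation within the budget.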