Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models. However, it is parameter-inefficient: all parameters of a large pre-trained model must be updated separately for each downstream task. As the number of parameters grows, fine-tuning is also prone to overfitting and catastrophic forgetting, and full fine-tuning can become prohibitively expensive when the model serves many tasks. To mitigate these issues, parameter-efficient transfer learning algorithms, such as adapters and prefix tuning, have been proposed to introduce a small number of trainable parameters that can be plugged into large pre-trained language models such as BERT and HuBERT. In this paper, we introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning across various speech-processing tasks. Additionally, we introduce a new adapter, ConvAdapter, based on 1D convolution. We show that ConvAdapter outperforms standard adapters while achieving performance comparable to prefix tuning and LoRA with only 0.94% of trainable parameters on some of the tasks in SURE. We further explore the effectiveness of parameter-efficient transfer learning for speech synthesis tasks such as Text-to-Speech (TTS).
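To make the adapter idea concrete, the following is a minimal PyTorch sketch of a 1D-convolution adapter in the spirit of ConvAdapter, assuming a bottleneck design with a residual connection inserted after a frozen transformer block; the paper's exact architecture, kernel size, and insertion points are not specified here, so all hyperparameters below are illustrative assumptions.

```python
# Minimal sketch of a 1D-convolution adapter ("ConvAdapter"-style).
# Assumption: a down-project / local 1D conv / up-project bottleneck with a
# residual connection, added to a frozen pre-trained backbone.
import torch
import torch.nn as nn


class ConvAdapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 32, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Down-project, mix local temporal context with a 1D convolution, then up-project.
        self.down = nn.Conv1d(hidden_dim, bottleneck_dim, kernel_size, padding=padding)
        self.act = nn.GELU()
        self.up = nn.Conv1d(bottleneck_dim, hidden_dim, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim) as produced by a transformer layer.
        residual = x
        x = x.transpose(1, 2)                 # -> (batch, hidden_dim, seq_len) for Conv1d
        x = self.up(self.act(self.down(x)))
        return residual + x.transpose(1, 2)   # residual keeps the frozen path intact


# Usage sketch: freeze the pre-trained backbone and train only the adapters.
# for p in backbone.parameters():
#     p.requires_grad = False
# adapter = ConvAdapter(hidden_dim=768)  # trainable module inserted after each block
```

Only the adapter parameters are updated during training, which is what keeps the trainable-parameter count small relative to full fine-tuning.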