We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a light-weight training phase. Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to the state-of-the-art Adapter approach with moderate to no accuracy loss and the same parameter efficiency.
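The conditional computation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a router that scores tokens with a single learned vector, routes only the top-k tokens through the expensive (frozen) pretrained block, and applies a cheap low-rank adapter update to every token. All names (`coda_layer`, `heavy_fn`, `adapter_fn`) and the toy dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def coda_layer(x, heavy_fn, adapter_fn, k):
    """Sketch of a conditional-adapter layer: a router picks the k
    highest-scoring tokens for the expensive pretrained block, while
    every token receives the cheap adapter update."""
    # Router: score each token with one vector (randomly initialized here;
    # in practice this would be learned during the light-weight training phase).
    w_router = rng.normal(size=x.shape[-1])
    scores = x @ w_router                      # (seq_len,)
    top = np.argsort(scores)[-k:]              # indices of the k selected tokens

    out = x + adapter_fn(x)                    # cheap path, applied to all tokens
    out[top] = out[top] + heavy_fn(x[top])     # expensive path, only k tokens
    return out

# Toy stand-ins: the "heavy" block is a dense projection standing in for a
# frozen transformer sublayer; the adapter is a small low-rank bottleneck.
d, seq_len, k = 16, 8, 3
W_heavy = rng.normal(size=(d, d)) * 0.1
A = rng.normal(size=(d, 4)) * 0.1
B = rng.normal(size=(4, d)) * 0.1

x = rng.normal(size=(seq_len, d))
y = coda_layer(x, lambda t: t @ W_heavy, lambda t: t @ A @ B, k)
print(y.shape)  # (8, 16)
```

Because only k of the seq_len tokens pass through the heavy block, the dominant per-layer cost scales with k rather than the full sequence length, which is the source of the speed-accuracy trade-off the abstract describes.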