On-device learning enables edge devices to continually adapt AI models to new data, which requires a small memory footprint to fit the tight memory constraints of edge devices. Existing work addresses this problem by reducing the number of trainable parameters. However, this does not directly translate into memory savings, since the major bottleneck is the activations, not the parameters. In this work, we present Tiny-Transfer-Learning (TinyTL) for memory-efficient on-device learning. TinyTL freezes the weights and learns only the bias modules, thus removing the need to store the intermediate activations. To maintain the adaptation capacity, we introduce a new memory-efficient bias module, the lite residual module, which refines the feature extractor by learning small residual feature maps while adding only 3.8% memory overhead. Extensive experiments show that TinyTL significantly reduces memory usage (up to 6.5x) with little accuracy loss compared to fine-tuning the full network. Compared to fine-tuning only the last layer, TinyTL provides significant accuracy improvements (up to 34.1%) with little memory overhead. Furthermore, combined with feature extractor adaptation, TinyTL provides 7.3-12.9x memory saving without sacrificing accuracy compared to fine-tuning the full Inception-V3.
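To see why freezing the weights removes the need to store intermediate activations, consider the standard back-propagation identities for a fully-connected layer $a_{i+1} = a_i W_i + b_i$ (the notation here is a brief illustrative sketch, not quoted from the paper body):
\[
\frac{\partial L}{\partial a_i} = \frac{\partial L}{\partial a_{i+1}} W_i^{\top},
\qquad
\frac{\partial L}{\partial W_i} = a_i^{\top}\,\frac{\partial L}{\partial a_{i+1}},
\qquad
\frac{\partial L}{\partial b_i} = \frac{\partial L}{\partial a_{i+1}}.
\]
Only the weight gradient depends on the input activation $a_i$; the bias gradient and the gradient propagated to earlier layers do not. When $W_i$ is frozen, $a_i$ therefore need not be kept after the forward pass, which is the source of the memory saving.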