Inspired by the Regularized Lottery Ticket Hypothesis (RLTH), which states that competitive smooth (non-binary) subnetworks exist within a dense network in continual learning tasks, we investigate two proposed architecture-based continual learning methods that sequentially learn and select adaptive binary subnetworks (WSN) and non-binary Soft-Subnetworks (SoftNet) for each task. WSN and SoftNet jointly learn the regularized model weights and the task-adaptive non-binary masks of the subnetworks associated with each task, while attempting to select a small set of weights to be activated (the winning ticket) by reusing weights of prior subnetworks. The proposed WSN and SoftNet are inherently immune to catastrophic forgetting in Task Incremental Learning (TIL), since each selected subnetwork does not infringe upon the subnetworks of other tasks. In TIL, the binary masks spawned per winning ticket are encoded into a single N-bit binary mask and then compressed with Huffman coding, so that network capacity grows only sub-linearly with the number of tasks. Surprisingly, at inference time, the SoftNet generated by injecting small noise into the background weights of an acquired WSN (while holding its foreground weights fixed) provides excellent forward transfer to future tasks in TIL. SoftNet also proves more effective than WSN at regularizing parameters against overfitting to a few examples in Few-Shot Class Incremental Learning (FSCIL).
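As a rough illustration of the per-task subnetwork selection described above, here is a minimal NumPy sketch, not the authors' implementation: `wsn_binary_mask` keeps the top-c% of weights by a per-weight score as the winning ticket, and `softnet_mask` keeps that foreground at one while filling the zero background with small random values. The function names, the score tensor, and the uniform noise model are illustrative assumptions.

```python
import numpy as np

def wsn_binary_mask(scores: np.ndarray, capacity: float) -> np.ndarray:
    """WSN-style binary mask: keep the top-c% of weights by score (the winning ticket)."""
    k = max(1, int(capacity * scores.size))
    threshold = np.sort(scores.ravel())[-k]          # k-th largest score
    return (scores >= threshold).astype(np.float32)

def softnet_mask(scores: np.ndarray, capacity: float,
                 noise_scale: float = 0.01, rng=None) -> np.ndarray:
    """SoftNet-style soft mask: the foreground (major subnetwork) stays at 1,
    while the zero background is replaced with small values (minor subnetwork)."""
    rng = rng or np.random.default_rng(0)
    hard = wsn_binary_mask(scores, capacity)
    background = rng.uniform(0.0, noise_scale, size=scores.shape).astype(np.float32)
    return hard + (1.0 - hard) * background
```

In this picture, the forward pass for task t uses only the masked weights (w * m_t), which is why updates for a later task cannot disturb an earlier winning ticket.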
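The sub-linear capacity claim rests on compressing the accumulated per-task binary masks. Below is a hedged sketch assuming the masks are stacked so that each weight carries one N-bit string (N = tasks seen so far), which is then Huffman-coded with a standard heapq-based codebook; it is an illustrative reconstruction, not the paper's exact encoding pipeline.

```python
import heapq
from collections import Counter
from typing import Dict, List
import numpy as np

def huffman_code(freqs: Counter) -> Dict[str, str]:
    """Build a prefix-free codebook from symbol frequencies (standard Huffman)."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

def compress_task_masks(masks: List[np.ndarray]) -> str:
    """Stack the per-task binary masks so each weight carries one N-bit symbol,
    then Huffman-encode the symbols; overlap between tasks keeps the bitstream short."""
    bits = np.stack([m.ravel().astype(int) for m in masks], axis=1)  # (num_weights, N)
    symbols = ["".join(map(str, row)) for row in bits]
    codebook = huffman_code(Counter(symbols))
    return "".join(codebook[s] for s in symbols)
```

Because tasks reuse weights of prior subnetworks, only a few of the 2^N possible per-weight bit patterns actually occur, which is what lets Huffman coding keep the stored mask small as tasks accumulate.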