In continual learning (CL), the goal is to design models that can learn a sequence of tasks without catastrophic forgetting. While there is a rich set of techniques for CL, relatively little is understood about how the representations built by previous tasks benefit new tasks that are added to the network. To address this, we study the problem of continual representation learning (CRL), in which we learn an evolving representation as new tasks arrive. Focusing on zero-forgetting methods in which tasks are embedded in subnetworks (e.g., PackNet), we first provide experiments demonstrating that CRL can significantly boost sample efficiency when learning new tasks. To explain this, we establish theoretical guarantees for CRL: sample complexity and generalization error bounds for new tasks that formalize the statistical benefits of previously learned representations. Our analysis and experiments also highlight the importance of the order in which we learn the tasks. Specifically, we show that CL benefits if the initial tasks have large sample size and high "representation diversity". Diversity ensures that new tasks incur small representation mismatch and can be learned with few samples while training only a few additional nonzero weights. Finally, we ask whether each task subnetwork can be made efficient at inference time while retaining the benefits of representation learning. To this end, we propose an inference-efficient variation of PackNet called Efficient Sparse PackNet (ESPN), which employs joint channel & weight pruning. ESPN embeds tasks in channel-sparse subnets requiring up to 80% fewer FLOPs to compute while approximately retaining accuracy, and is very competitive with a variety of baselines. In summary, this work takes a step towards data- and compute-efficient CL with a representation learning perspective. GitHub page: https://github.com/ucr-optml/CtRL
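To make the joint channel & weight pruning idea behind ESPN concrete, here is a minimal NumPy sketch. It is an illustration of the general two-stage idea (prune whole output channels first, then prune individual weights within the surviving channels), not the paper's exact ESPN procedure; the function name, scoring by L2 norm, and the keep-fraction parameters are all assumptions made for this example.

```python
import numpy as np

def joint_channel_weight_prune(w, channel_keep=0.5, weight_keep=0.5):
    """Illustrative joint channel & weight pruning (hypothetical sketch,
    not the paper's exact ESPN algorithm).

    w            : conv weight of shape (out_channels, in_channels, kH, kW)
    channel_keep : fraction of output channels retained
    weight_keep  : fraction of weights retained within surviving channels
    Returns the pruned weight tensor and a boolean channel mask.
    """
    out_ch = w.shape[0]

    # Stage 1 (channel pruning): rank output channels by L2 norm and keep
    # the strongest ones, zeroing the rest. Zeroed channels can be skipped
    # entirely at inference time, which is where the FLOP savings come from.
    norms = np.sqrt((w ** 2).reshape(out_ch, -1).sum(axis=1))
    n_keep = max(1, int(round(channel_keep * out_ch)))
    channel_mask = np.zeros(out_ch, dtype=bool)
    channel_mask[np.argsort(norms)[-n_keep:]] = True
    pruned = np.where(channel_mask[:, None, None, None], w, 0.0)

    # Stage 2 (weight pruning): within surviving channels, keep only the
    # largest-magnitude weights, producing an additionally weight-sparse subnet.
    surviving = np.abs(pruned[channel_mask])
    k = max(1, int(round(weight_keep * surviving.size)))
    thresh = np.partition(surviving.ravel(), -k)[-k]
    pruned = np.where(np.abs(pruned) >= thresh, pruned, 0.0)
    return pruned, channel_mask
```

In a zero-forgetting setting such as PackNet, each task would own one such sparse mask, and the weights selected for earlier tasks are frozen so later tasks cannot overwrite them; the channel-level structure is what makes the per-task subnetworks cheap at inference time.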