Self-supervised learning (SSL) approaches have made major strides by matching the performance of their supervised counterparts on several computer vision benchmarks. This, however, comes at the cost of substantially larger model sizes and computationally expensive training strategies, which eventually lead to longer inference times, making them impractical for resource-constrained industrial settings. Techniques such as knowledge distillation (KD), dynamic computation (DC), and pruning are often used to obtain a lightweight sub-network, but they typically require multiple epochs of fine-tuning a large pre-trained model, adding further computational burden. In this work we propose a novel perspective on the interplay between the SSL and DC paradigms: the two can be leveraged to simultaneously learn, from scratch, a dense network and a gated (sparse/lightweight) sub-network that offer a good accuracy-efficiency trade-off, thereby yielding a generic, multi-purpose architecture for application-specific industrial settings. Our study conveys a constructive message: exhaustive experiments on several image classification benchmarks (CIFAR-10, STL-10, CIFAR-100, and ImageNet-100) demonstrate that the proposed training strategy provides a dense network and a corresponding sparse sub-network that achieve performance on par with the vanilla self-supervised setting, while significantly reducing computation in terms of FLOPs across a range of target budgets.
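To make the joint dense/gated training idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a SimCLR-style NT-Xent contrastive loss, learnable per-channel sigmoid gates, and a simple L1 penalty on gate activations standing in for a FLOPs budget constraint. All class and function names here are hypothetical.

```python
# Minimal sketch of jointly training a dense network and a gated sub-network
# under a shared self-supervised objective. Hypothetical, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    """Conv block with learnable per-channel gates used by the sparse path."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))  # learnable gates

    def forward(self, x, gated: bool):
        h = F.relu(self.bn(self.conv(x)))
        if gated:
            g = torch.sigmoid(self.gate_logits).view(1, -1, 1, 1)
            h = h * g  # soft channel gating; can be hardened/pruned at inference
        return h

class Encoder(nn.Module):
    """Tiny encoder with a projection head for the contrastive loss."""
    def __init__(self):
        super().__init__()
        self.blocks = nn.ModuleList([GatedConvBlock(3, 32), GatedConvBlock(32, 64)])
        self.head = nn.Linear(64, 128)

    def forward(self, x, gated: bool):
        for blk in self.blocks:
            x = blk(x, gated)
        x = x.mean(dim=(2, 3))  # global average pooling
        return F.normalize(self.head(x), dim=1)

def nt_xent(z1, z2, tau=0.5):
    """Standard NT-Xent contrastive loss over two augmented views."""
    z = torch.cat([z1, z2], dim=0)
    sim = (z @ z.t()) / tau
    sim.fill_diagonal_(float('-inf'))  # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def training_step(model, view1, view2, budget_weight=0.1):
    # SSL loss on both the dense path and the gated (sparse) path, from scratch.
    loss_dense = nt_xent(model(view1, gated=False), model(view2, gated=False))
    loss_sparse = nt_xent(model(view1, gated=True), model(view2, gated=True))
    # L1 penalty on gate activations: a crude stand-in for a FLOPs budget term.
    gate_cost = sum(torch.sigmoid(b.gate_logits).mean() for b in model.blocks)
    return loss_dense + loss_sparse + budget_weight * gate_cost
```

In this sketch, a single set of weights serves both paths: the ungated forward pass recovers the dense model, while thresholding the learned gates at inference prunes channels outright to yield the lightweight sub-network, with the budget weight controlling the sparsity/accuracy trade-off.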