Whilst computer vision models built using self-supervised approaches are now commonplace, some important questions remain. Do self-supervised models learn highly redundant channel features? What if a self-supervised network could dynamically select the important channels and discard the unnecessary ones? Convnets pre-trained with self-supervision currently obtain performance on downstream tasks comparable to that of their supervised counterparts in computer vision. However, self-supervised models have drawbacks, including large numbers of parameters, computationally expensive training strategies, and a clear need for faster inference on downstream tasks. In this work, our goal is to address the last of these by studying how a standard channel selection method developed for supervised learning can be applied to networks trained with self-supervision. We validate our findings over a range of target budgets $t_{d}$ for channel computation on the image classification task across different datasets, specifically CIFAR-10, CIFAR-100, and ImageNet-100, obtaining performance comparable to that of the original network when all channels are selected, but at a significant reduction in computation, reported in terms of FLOPs.
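The core idea of budget-constrained channel selection can be illustrated with a minimal sketch. Assuming each channel has a learned importance score (a stand-in for whatever saliency the selection method produces; the function name, score values, and budget interpretation below are illustrative, not taken from the paper), the target budget $t_d$ can be read as the fraction of channels to keep:

```python
import numpy as np

def select_channels(importance, t_d):
    """Hypothetical sketch: keep the top-scoring channels of a layer so that
    the kept fraction roughly matches a target computation budget t_d in (0, 1]."""
    n = len(importance)
    k = max(1, int(round(t_d * n)))       # number of channels to keep
    order = np.argsort(importance)[::-1]  # most important channels first
    return np.sort(order[:k])             # kept channel indices, ascending

# Illustrative importance scores for a 6-channel layer.
importance = np.array([0.9, 0.1, 0.4, 0.8, 0.05, 0.6])
kept = select_channels(importance, 0.5)   # keep ~50% of channels
print(kept)                               # indices of the retained channels
```

With $t_d = 1$ every channel is retained and the layer behaves like the original network; smaller budgets trade accuracy for a proportional reduction in channel computation (and hence FLOPs).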