A common practice in transfer learning is to initialize the downstream model weights by pre-training on a data-abundant upstream task. In object detection specifically, the feature backbone is typically initialized with ImageNet classifier weights and fine-tuned on the object detection task. Recent works show this is not strictly necessary under longer training regimes and provide recipes for training the backbone from scratch. We investigate the opposite direction of this end-to-end training trend: we show that an extreme form of knowledge preservation -- freezing the classifier-initialized backbone -- consistently improves many different detection models, and leads to considerable resource savings. We hypothesize, and corroborate experimentally, that the capacity and structure of the remaining detector components are a crucial factor in leveraging the frozen backbone. Immediate applications of our findings include performance improvements on hard cases such as detection of long-tail object classes, as well as computational and memory savings that help make the field more accessible to researchers with limited computational resources.
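For illustration only (the abstract does not specify the authors' framework), the sketch below shows one way to freeze a classifier-initialized backbone while fine-tuning the remaining detector components, assuming PyTorch with a standard torchvision Faster R-CNN whose ResNet-50 trunk is initialized from ImageNet weights; argument names such as `weights_backbone` follow torchvision >= 0.13 and the model choice is an assumption, not the paper's exact setup.

```python
import torch
import torchvision

# Hypothetical example model: Faster R-CNN with an ImageNet-pretrained
# ResNet-50 FPN backbone; detection-specific components start untrained.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, weights_backbone="IMAGENET1K_V1", num_classes=91
)

# Freeze the classifier-initialized backbone trunk (model.backbone.body is
# the ResNet; model.backbone.fpn and the RPN/ROI heads stay trainable).
for param in model.backbone.body.parameters():
    param.requires_grad = False
# Keep the frozen trunk in eval mode so BatchNorm statistics are not updated;
# re-apply this after every model.train() call during the training loop.
model.backbone.body.eval()

# Optimize only the remaining detector components.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.02, momentum=0.9, weight_decay=1e-4)
```

Besides skipping backbone gradient computation and optimizer state (the source of the memory and compute savings mentioned above), this setup leaves the capacity and structure of the trainable detector components as the main knobs, which is the factor the abstract identifies as crucial.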