Rethinking ImageNet Pre-training

We report competitive results on object detection and instance segmentation on the COCO dataset using standard models trained from random initialization. The results are no worse than their ImageNet pre-training counterparts even when using the hyper-parameters of the baseline system (Mask R-CNN) that were optimized for fine-tuning pre-trained models, with the sole exception of increasing the number of training iterations so the randomly initialized models may converge. Training from random initialization is surprisingly robust; our results hold even when: (i) using only 10% of the training data, (ii) for deeper and wider models, and (iii) for multiple tasks and metrics. Experiments show that ImageNet pre-training speeds up convergence early in training, but does not necessarily provide regularization or improve final target task accuracy. To push the envelope, we demonstrate 50.9 AP on COCO object detection without using any external data, a result on par with the top COCO 2017 competition results that used ImageNet pre-training. These observations challenge the conventional wisdom of ImageNet pre-training for dependent tasks and we expect these discoveries will encourage people to rethink the current de facto paradigm of 'pre-training and fine-tuning' in computer vision.


Related content

ImageNet (dataset)


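The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million image URLs have been hand-annotated by ImageNet to indicate which objects are pictured; for at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories; a typical category, such as "balloon" or "strawberry", contains several hundred images. The database of annotations for third-party image URLs is freely available directly from ImageNet; the actual images, however, are not owned by ImageNet. Since 2010, the ImageNet project has run an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of 1,000 non-overlapping classes. A dramatic breakthrough on the ImageNet challenge in 2012 is widely regarded as the beginning of the deep learning revolution of the 2010s.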
