Current state-of-the-art object detectors are fine-tuned from off-the-shelf networks pretrained on large-scale classification datasets such as ImageNet, which introduces several problems: 1) the domain gap between the source and target datasets; 2) the learning objective bias between classification and detection; 3) the architectural limitations of classification networks when used for detection. In this paper, we design a new single-shot, train-from-scratch object detector, called ScratchDet, based on the architectures of the ResNet and VGGNet based SSD models, to alleviate these problems. Specifically, we study the impact of BatchNorm on training detectors from scratch, and find that using BatchNorm in both the backbone and the detection head subnetworks makes the detector converge well from scratch. We then explore the network architecture by analyzing the detection performance of ResNet and VGGNet, and introduce a new Root-ResNet backbone network to further improve accuracy. Extensive experiments on the PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO datasets demonstrate that ScratchDet achieves state-of-the-art performance among all train-from-scratch detectors and even outperforms existing one-stage pretrained methods without bells and whistles. Code will be made publicly available at https://github.com/KimSoybean/ScratchDet.
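As a rough illustration of the BatchNorm operation mentioned above, the following is a minimal sketch of the standard training-mode batch normalization transform for a single channel, written in plain Python. It is not taken from the ScratchDet code; the function name `batch_norm` and its parameters `gamma`, `beta` and `eps` are assumptions chosen for illustration. The sketch shows one property commonly cited as helping optimization: the normalized output is almost unchanged when the input activations are rescaled.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations across a mini-batch.

    Hypothetical helper for illustration only: `batch` is a list of floats,
    `gamma` and `beta` are the learnable scale and shift, and `eps` guards
    against division by zero.
    """
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance, as used in training-mode BatchNorm.
    var = sum((x - mean) ** 2 for x in batch) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [gamma * (x - mean) * inv_std + beta for x in batch]


if __name__ == "__main__":
    activations = [1.0, 2.0, 3.0, 4.0]
    normalized = batch_norm(activations)
    # Rescaling the inputs by 10 leaves the normalized output nearly the same.
    rescaled = batch_norm([10.0 * x for x in activations])
    print(normalized)
    print(rescaled)
```

In a real detector this transform would be applied per channel after each convolution, in both the backbone and the detection head, typically through a framework's built-in layer rather than hand-written code.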