Object detectors are usually equipped with backbone networks designed for image classification. This can be sub-optimal because of the gap between the tasks of image classification and object detection. In this work, we present DetNAS, which uses Neural Architecture Search (NAS) to design better backbones for object detection. This is non-trivial because detection training typically needs ImageNet pre-training, while NAS systems require accuracies on the target detection task as supervisory signals. Based on the technique of a one-shot supernet, which contains all possible networks in the search space, we propose a framework for backbone search on object detection. We train the supernet under the typical detector training schedule: ImageNet pre-training and detection fine-tuning. Then, the architecture search is performed on the trained supernet, using the detection task as the guidance. This framework makes NAS on backbones very efficient. In experiments, we show the effectiveness of DetNAS on various detectors, for instance, the one-stage RetinaNet and the two-stage FPN. We empirically find that networks searched on object detection show consistent superiority over those searched on ImageNet classification. The resulting architecture achieves better performance than hand-crafted networks on COCO with much lower FLOPs complexity.
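The one-shot supernet idea above can be illustrated with a minimal sketch: candidate blocks share supernet weights, single paths are sampled during training, and the search step scores sampled paths on the target task. All names, sizes, and the scoring function here are illustrative assumptions, not the paper's actual search space or evaluation protocol.

```python
import random

# Illustrative search space: 5 layers, 4 candidate blocks per layer
# (sizes are assumptions for this sketch, not DetNAS's real space).
NUM_LAYERS = 5
NUM_CHOICES = 4

def sample_path():
    """Sample one single-path architecture from the supernet."""
    return [random.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]

def supernet_train_step(path):
    """Placeholder for one training step: only the sampled path's blocks
    are activated, so all candidates share the supernet's weights."""
    pass  # in practice: forward/backward through the chosen blocks only

def search(eval_fn, num_samples=100):
    """Score sampled paths with inherited supernet weights and keep the
    best; eval_fn stands in for detection-task accuracy (e.g. mAP)."""
    best_path, best_score = None, float("-inf")
    for _ in range(num_samples):
        path = sample_path()
        score = eval_fn(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path

# Usage with a dummy scoring function; a real system would evaluate each
# path on the detection validation set after supernet fine-tuning.
random.seed(0)
best = search(lambda p: -sum(p))
```

The key property this sketch captures is that search is decoupled from training: once the supernet is trained, evaluating a candidate path costs only an inference pass, which is what makes backbone search on detection tractable.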