Most object detection frameworks use backbone architectures originally designed for image classification, conventionally initialized with parameters pre-trained on ImageNet. However, image classification and object detection are fundamentally different tasks, and there is no guarantee that the optimal backbone for classification is also optimal for detection. Recent neural architecture search (NAS) research has demonstrated that automatically designing a backbone specifically for object detection improves overall accuracy. In this paper, we introduce a neural architecture adaptation method that optimizes a given backbone for detection while still allowing the use of pre-trained parameters. We propose to adapt both the micro- and macro-architecture by searching over the operations and the number of layers, as well as the output channel dimensions of each block. Finding the optimal channel dimensions is important because they strongly affect both feature representation capability and computation cost. We conduct object detection experiments with our searched backbone and demonstrate that it outperforms both manually designed and searched state-of-the-art backbones on the COCO dataset.
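To make the computation-cost side of the channel-dimension claim concrete, the following minimal sketch counts parameters and multiply-accumulate operations (MACs) for a chain of standard convolutions. It is an illustration only and not the search method of this paper: the names `ConvSpec`, `conv_params`, `conv_macs`, and `stack_cost`, the 3x3 kernel, the 56x56 feature map, and the stride-1, bias-free setting are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ConvSpec:
    """Hypothetical description of one standard (non-grouped) convolution."""
    in_channels: int
    out_channels: int
    kernel_size: int = 3  # assumed square kernel


def conv_params(spec: ConvSpec) -> int:
    """Weight count of a bias-free convolution: k * k * C_in * C_out."""
    return spec.kernel_size ** 2 * spec.in_channels * spec.out_channels


def conv_macs(spec: ConvSpec, out_h: int, out_w: int) -> int:
    """MACs: every output position applies the full weight tensor once."""
    return conv_params(spec) * out_h * out_w


def stack_cost(widths: Sequence[int], in_channels: int = 64,
               feature_size: int = 56) -> Tuple[int, int]:
    """Total (params, MACs) for a chain of stride-1 convolutions.

    `widths` lists the output channels of each layer; the spatial size is
    assumed to stay at feature_size x feature_size throughout.
    """
    total_params = 0
    total_macs = 0
    c_in = in_channels
    for c_out in widths:
        spec = ConvSpec(c_in, c_out)
        total_params += conv_params(spec)
        total_macs += conv_macs(spec, feature_size, feature_size)
        c_in = c_out  # the next layer consumes this layer's output
    return total_params, total_macs


if __name__ == "__main__":
    # Compare two candidate width settings for the same two-layer block.
    for widths in ([64, 64], [128, 128]):
        params, macs = stack_cost(widths)
        print(f"widths={widths}: params={params:,} MACs={macs:,}")
```

Because the cost of each layer depends on the product of its input and output channels, widening consecutive layers together increases cost roughly quadratically, which is one reason channel dimensions are a natural quantity to search over rather than fix by hand.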