CBNetV2: A Composite Backbone Network Architecture for Object Detection

Modern top-performing object detectors depend heavily on backbone networks, whose advances bring consistent performance gains through the exploration of more effective network structures. In this paper, we propose a novel and flexible backbone framework, CBNetV2, to better train existing open-source pre-trained backbones under the pre-training and fine-tuning protocol. The CBNetV2 architecture groups multiple identical backbones, which are connected through composite connections. Specifically, it integrates the high- and low-level features of multiple backbone networks and gradually expands the receptive field to perform object detection more efficiently. We also propose a better training strategy with assistant supervision for CBNet-based detectors. CBNetV2 generalizes well across different backbones and detector head designs. Without additional pre-training, CBNetV2 can be adapted to various backbones, including manually designed and NAS-based ones, as well as CNN-based and Transformer-based ones. Experiments provide strong evidence that composite backbones are more efficient, effective, and resource-friendly than wider and deeper networks. CBNetV2 is compatible with the head designs of most mainstream detectors, including one-stage and two-stage detectors as well as anchor-based and anchor-free ones, and improves their performance by more than 3.0% AP over the baseline on COCO. In particular, under the single-model and single-scale testing protocol, our Dual-Swin-L achieves 59.4% box AP and 51.6% mask AP on COCO test-dev, surpassing the previous state-of-the-art result (57.7% box AP and 50.2% mask AP). Code is available at https://github.com/VDIGPKU/CBNetV2.
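To make the idea of identical backbones joined by composite connections more concrete, the following is a minimal toy sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: feature maps are plain 1-D lists, each stage is a hypothetical `toy_stage` that halves the resolution, the composite connection adds the previous backbone's same-stage output (upsampled) to the next backbone's stage input, and assistant supervision is modeled as a weighted sum of losses with a hypothetical `weight` parameter. The names `composite_forward`, `upsample`, and `total_loss` are illustrative only.

```python
from typing import Callable, List

Feature = List[float]                  # toy 1-D "feature map"
Stage = Callable[[Feature], Feature]


def toy_stage(scale: float) -> Stage:
    """Hypothetical stand-in for one backbone stage: halve the resolution, then scale."""
    def stage(x: Feature) -> Feature:
        pooled = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
        return [scale * v for v in pooled]
    return stage


def upsample(x: Feature, size: int) -> Feature:
    """Nearest-neighbour upsampling of x to `size` elements."""
    return [x[i * len(x) // size] for i in range(size)]


def composite_forward(backbones: List[List[Stage]], image: Feature) -> List[List[Feature]]:
    """Run the backbones in sequence; each one receives features from the previous one.

    Returns one list of per-stage outputs per backbone; the last entry
    belongs to the lead backbone, the earlier ones to assisting backbones.
    """
    all_outputs: List[List[Feature]] = []
    for stages in backbones:
        x = image
        outputs: List[Feature] = []
        for i, stage in enumerate(stages):
            if all_outputs and i > 0:
                # Assumed composite connection: add the previous backbone's
                # stage-i output, upsampled to the resolution of this stage's input.
                assist = upsample(all_outputs[-1][i], len(x))
                x = [a + b for a, b in zip(x, assist)]
            x = stage(x)
            outputs.append(x)
        all_outputs.append(outputs)
    return all_outputs


def total_loss(lead_loss: float, assist_losses: List[float], weight: float = 0.5) -> float:
    """Assumed form of assistant supervision: lead loss plus weighted assistant losses."""
    return lead_loss + weight * sum(assist_losses)


# Two identical toy backbones with two stages each (a "dual" composite backbone).
backbones = [[toy_stage(1.0), toy_stage(1.0)] for _ in range(2)]
features = composite_forward(backbones, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(features[0][-1])  # assisting backbone, last stage: [2.5, 6.5]
print(features[1][-1])  # lead backbone, last stage: [5.0, 13.0]
```

In this sketch only the lead backbone's outputs would be passed on to the detector neck and head, while the assisting backbone's outputs would feed an auxiliary head during training. A real implementation would use learned convolutional stages and a learned transformation inside each composite connection.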
