This paper studies the feature pyramid network (FPN), a widely used module for aggregating multi-scale feature information in object detection systems. In most existing works, the performance gain is mainly attributable to an increased computational burden, especially in floating-point operations (FLOPs). In addition, the multi-scale information within each layer of the FPN has not been well investigated. To this end, we first introduce an inception FPN, in which each layer contains convolution filters with different kernel sizes to enlarge the receptive field and integrate more useful information. Moreover, we point out that not all objects need such a complicated calculation module, and we propose a new dynamic FPN (DyFPN). Each layer in the DyFPN consists of multiple branches with different computational costs. Specifically, the output features of the DyFPN are computed by a branch that is adaptively selected by a learnable gating operation. The proposed method therefore provides more efficient dynamic inference and achieves a better trade-off between detection accuracy and computational cost. Extensive experiments on benchmarks demonstrate that the proposed DyFPN significantly improves performance through a more efficient allocation of computation resources. For instance, replacing the FPN with the inception FPN improves detection accuracy by 1.6 AP using the Faster R-CNN paradigm on COCO minival, and the DyFPN further reduces its FLOPs by about 40% while maintaining similar performance.
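The branch-selection idea described above can be illustrated with a small cost model. The following is only a sketch under stated assumptions, not the architecture from the paper: the function names (`conv_macs`, `branch_costs`, `gate`, `average_cost`), the kernel sizes (1, 3, 5), the two-branch layout, and the linear-plus-argmax gate are all hypothetical choices made for illustration. It counts multiply-accumulate operations for a light and a heavy branch and shows how a hard gate changes the average cost per input.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def branch_costs(h, w, channels, heavy_kernels=(1, 3, 5), light_kernel=1):
    """Cost of two hypothetical branches of one pyramid layer.

    Assumption: the heavy branch runs one convolution per kernel size in
    parallel (inception style); the light branch runs a single convolution.
    """
    heavy = sum(conv_macs(h, w, channels, channels, k) for k in heavy_kernels)
    light = conv_macs(h, w, channels, channels, light_kernel)
    return {"light": light, "heavy": heavy}


def gate(pooled, weights, bias):
    """Hard gate (assumed form): linear layer on pooled features, then argmax.

    pooled  : list of globally pooled feature values
    weights : two rows of weights, one per branch (row 0 light, row 1 heavy)
    bias    : two bias values
    """
    logits = [
        sum(w_i * x_i for w_i, x_i in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    return "heavy" if logits[1] > logits[0] else "light"


def average_cost(decisions, costs):
    """Mean cost per input given the branch chosen for each input."""
    return sum(costs[d] for d in decisions) / len(decisions)


if __name__ == "__main__":
    costs = branch_costs(h=64, w=64, channels=256)
    weights = [[1.0, 0.0], [0.0, 1.0]]  # toy gate parameters, not learned
    bias = [0.0, 0.0]
    inputs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.2]]  # toy pooled features
    decisions = [gate(x, weights, bias) for x in inputs]
    print(decisions)
    print(average_cost(decisions, costs) / costs["heavy"])
```

In this toy model, the saving depends entirely on how often the gate routes an input to the light branch. The 40% FLOPs reduction reported above is an empirical result and is not derived from this sketch.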