The feature pyramid network (FPN) is an effective framework for extracting multi-scale features in object detection. However, most current FPN-based methods suffer from an intrinsic flaw of channel reduction, which causes a loss of semantic information. In addition, the various fused feature maps may cause serious aliasing effects. In this paper, we present a novel channel enhancement feature pyramid network (CE-FPN) with three simple yet effective modules to alleviate these problems. Specifically, inspired by sub-pixel convolution, we propose a sub-pixel skip fusion method that performs both channel enhancement and upsampling. It replaces the original 1x1 convolution and linear upsampling, and thereby mitigates the information loss due to channel reduction. We then propose a sub-pixel context enhancement module for extracting more feature representations; it is superior to other context methods because sub-pixel convolution lets it exploit rich channel information. Furthermore, we introduce a channel attention guided module to optimize the final integrated features on each level, which alleviates the aliasing effect at only a small computational cost. Our experiments show that CE-FPN achieves competitive performance compared to state-of-the-art FPN-based detectors on the MS COCO benchmark.
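To make the sub-pixel idea concrete, the following is a minimal sketch of the pixel-rearrangement step that sub-pixel convolution relies on: a map with C·r² channels at resolution H×W is reorganized into C channels at resolution rH×rW, so upsampling consumes channels rather than discarding them. This is only an illustration of that rearrangement, not the CE-FPN modules themselves, whose details the abstract does not give. The function name `pixel_shuffle` and the nested-list layout are assumptions made for this example; the channel ordering follows the common convention also used by PyTorch's `torch.nn.PixelShuffle`.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Hypothetical helper for illustration; not taken from the paper.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)

    # Allocate the upsampled output: fewer channels, larger spatial size.
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]

    for c in range(c_out):
        for i in range(h * r):
            for j in range(w * r):
                # Each output pixel reads from one of the r*r input channels
                # belonging to output channel c, chosen by its sub-pixel offset.
                src = c * r * r + (i % r) * r + (j % r)
                out[c][i][j] = x[src][i // r][j // r]
    return out


# Four 1x1 channels become one 2x2 channel when r = 2.
features = [[[1]], [[2]], [[3]], [[4]]]
upsampled = pixel_shuffle(features, 2)
print(upsampled)
```

In a learned setting, a convolution would first produce the C·r² channels and this rearrangement would follow it; here the input values are fixed by hand so that only the data movement is visible.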