Recent advances in compressing high-accuracy convolutional neural networks (CNNs) have brought remarkable progress to real-time object detection. To accelerate detection speed, lightweight detectors usually employ only a few convolutional layers in a single-path backbone. A single-path architecture, however, involves successive pooling and downsampling operations, which often yield coarse and inaccurate feature maps that are unfavorable for localizing objects. On the other hand, due to their limited capacity, recent lightweight networks are often weak at representing large-scale visual data. To address these problems, this paper presents a dual-path network, named DPNet, with a lightweight attention scheme for real-time object detection. The dual-path architecture enables us to extract high-level semantic features and low-level object details in parallel. Although DPNet has a nearly duplicated shape with respect to single-path detectors, its computational cost and model size are not significantly increased. To enhance representation capability, a lightweight self-correlation module (LSCM) is designed to capture global interactions with only a small computational overhead and few network parameters. In the neck, LSCM is extended into a lightweight cross-correlation module (LCCM) that captures mutual dependencies among neighboring-scale features. We have conducted extensive experiments on the MS COCO and Pascal VOC 2007 datasets. The experimental results demonstrate that DPNet achieves a state-of-the-art trade-off between detection accuracy and implementation efficiency. Specifically, DPNet achieves 30.5% AP on the MS COCO test-dev set and 81.5% mAP on the Pascal VOC 2007 test set, with a model size of about 2.5M parameters, 1.04 GFLOPs, and 164 FPS and 196 FPS on the two datasets for 320 × 320 input images.
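To make the notion of a "lightweight self-correlation module" more concrete, the following is a minimal sketch of one plausible realization: channel-reduced self-attention over all spatial positions, which captures global interactions while keeping parameters low. The abstract does not specify LSCM's actual design, so the class name `LSCMSketch`, the 1x1 projection layers, the reduction ratio, and the residual connection are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of a lightweight self-correlation block (not the paper's LSCM).
import torch
import torch.nn as nn


class LSCMSketch(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        reduced = channels // reduction
        # 1x1 convolutions project features into a low-dimensional space
        # to keep the parameter count and FLOPs small.
        self.query = nn.Conv2d(channels, reduced, kernel_size=1)
        self.key = nn.Conv2d(channels, reduced, kernel_size=1)
        self.value = nn.Conv2d(channels, reduced, kernel_size=1)
        self.out = nn.Conv2d(reduced, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C')
        k = self.key(x).flatten(2)                     # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)   # (B, HW, C')
        # Pairwise self-correlation over all spatial positions (global interactions).
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)  # (B, HW, HW)
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)  # residual connection preserves the original features


if __name__ == "__main__":
    feat = torch.randn(1, 64, 20, 20)
    print(LSCMSketch(64)(feat).shape)  # torch.Size([1, 64, 20, 20])
```

Under this reading, the LCCM described in the neck would let queries come from one scale and keys/values from a neighboring scale, so that attention models cross-scale rather than within-scale dependencies; again, this is an interpretation of the abstract, not a confirmed design.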