Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current object detection field. It uses a fully convolutional neural network to detect objects at all scales in an image. Deconvolutional Single Shot Detector (DSSD) introduces more context information by adding a deconvolution module to SSD, which raises the mean Average Precision (mAP) on PASCAL VOC2007 from SSD's 77.5% to 78.6%. Although DSSD gains 1.1 points of mAP over SSD, its speed drops from 46 to 11.8 frames per second (FPS). In this paper, we propose a single-stage, end-to-end image detection model called ESSD to overcome this dilemma. Our solution is to extend better context information to the shallow layers of the best single-stage detectors (e.g. SSD). Experimental results show that our model reaches 79.4% mAP, which is higher than DSSD and SSD by 0.8 and 1.9 points respectively. Meanwhile, our testing speed is 25 FPS on a Titan X GPU, more than double that of the original DSSD.
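The accuracy and speed trade-off quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the paper: the `results` table and the `compare` helper are hypothetical names, the numbers are copied from the abstract, and it assumes the three FPS figures were measured under comparable conditions, which the abstract does not state.

```python
# Figures quoted in the abstract (PASCAL VOC2007 mAP in %, speed in FPS).
# Assumption: the FPS values are directly comparable across models.
results = {
    "SSD": {"map": 77.5, "fps": 46.0},
    "DSSD": {"map": 78.6, "fps": 11.8},
    "ESSD": {"map": 79.4, "fps": 25.0},
}


def compare(model, baseline):
    """Return (mAP gain in points, speed ratio) of model relative to baseline."""
    gain = round(results[model]["map"] - results[baseline]["map"], 1)
    ratio = round(results[model]["fps"] / results[baseline]["fps"], 2)
    return gain, ratio


for model, baseline in [("DSSD", "SSD"), ("ESSD", "DSSD"), ("ESSD", "SSD")]:
    gain, ratio = compare(model, baseline)
    print(f"{model} vs {baseline}: {gain:+.1f} mAP points, {ratio:.2f}x speed")
```

Under these numbers, ESSD runs at roughly 2.1 times the speed of DSSD while remaining at a little over half the speed of SSD.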
