In most object detectors built on multi-scale feature maps, the shallow layers carry fine spatial information and are therefore mainly responsible for detecting small objects. Small object detection nevertheless remains less than satisfactory, because shallow feature maps lack semantic information. In this paper, we design a Multi-scale Deconvolutional Single Shot Detector (MDSSD) aimed specifically at small object detection. In MDSSD, multiple high-level feature maps at different scales are upsampled simultaneously to increase their spatial resolution. We then connect them to low-level feature maps through skip connections implemented by the Fusion Block. The resulting fused feature maps, collectively named the Fusion Module, have strong representational power for small instances. Notably, the high-level feature maps used in the Fusion Block preserve both strong semantic information and some fine details of small instances, unlike the top-most layer, where the fine details of small objects are potentially wiped out. The proposed framework achieves 77.6% mAP for small object detection on the challenging TT100K dataset with 512 × 512 input, outperforming other detectors by a large margin. It also achieves state-of-the-art results for general object detection on PASCAL VOC2007 test and MS COCO test-dev2015, in particular improving by 2 to 5 points on small object categories.
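To make the fusion step described in the abstract concrete, the following is a minimal pure-Python sketch of upsampling a coarse feature map by transposed convolution (deconvolution) and merging it with a finer one. It is an illustration under stated assumptions, not the authors' implementation: the helper names `deconv2d` and `fuse`, the fixed all-ones kernel, and the choice of element-wise sum followed by ReLU as the merge operation are all assumptions, since the abstract does not specify them. A real detector would use learned, multi-channel layers from a deep learning framework.

```python
def deconv2d(fmap, kernel, stride=2):
    """Transposed convolution of a single-channel 2D map, no padding.

    Each input value scatters a scaled copy of the kernel into the output,
    so the output side length is (n - 1) * stride + k.
    """
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += fmap[i][j] * kernel[a][b]
    return out


def fuse(low, high_up):
    """Merge two same-sized maps: element-wise sum, then ReLU (assumed merge op)."""
    if len(low) != len(high_up) or len(low[0]) != len(high_up[0]):
        raise ValueError("feature maps must have the same spatial size")
    return [
        [max(0.0, lo + hi) for lo, hi in zip(low_row, high_row)]
        for low_row, high_row in zip(low, high_up)
    ]


# Hypothetical toy inputs: a coarse 2x2 "high-level" map and a fine 4x4 "low-level" map.
high = [[1.0, 2.0],
        [3.0, 4.0]]
low = [[0.5, -2.0, 0.0, 1.0],
       [0.0, 0.0, 0.0, 0.0],
       [-5.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

# A 2x2 all-ones kernel with stride 2 reduces to nearest-neighbour upsampling;
# in a trained network the kernel weights would be learned instead.
kernel = [[1.0, 1.0],
          [1.0, 1.0]]

high_up = deconv2d(high, kernel, stride=2)  # 2x2 -> 4x4
fused = fuse(low, high_up)                  # same size as the low-level map
```

The fused map keeps the spatial resolution of the low-level input while adding the upsampled high-level response at every location, which is the intuition behind combining fine detail with semantic information.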