Object detection has improved substantially in recent years, but detection accuracy on small objects remains unsatisfactory. To address this issue, this work proposes a strategy based on feature fusion and dilated convolution, in which dilated convolution broadens the receptive field of feature maps at multiple scales. On the one hand, this improves the detection accuracy of larger objects; on the other hand, it provides more contextual information for small objects, which helps improve their detection accuracy. Shallow semantic information about small objects is obtained by filtering out noise in the feature map, and more small-object feature information is preserved by a multi-scale feature fusion module and an attention mechanism. Fusing this shallow feature information with deep semantic information yields richer feature maps for small object detection. Experiments show that the method achieves higher accuracy than the traditional YOLOv3 network on small objects and occluded objects. In addition, we achieve 32.8\% mean Average Precision (mAP) on small objects on the MS COCO2017 test set. For 640×640 input, the method reaches 88.76\% mAP on the PASCAL VOC2012 dataset.
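To illustrate why dilated convolution broadens the receptive field, the following minimal sketch computes the receptive field of a stack of convolution layers using the standard relation $k_{\text{eff}} = k + (k - 1)(d - 1)$ for kernel size $k$ and dilation rate $d$. The function names and the layer configurations (three 3×3 layers with dilation rates 1, 2, 4) are assumptions chosen for illustration; they are not taken from the network described here.

```python
from typing import List, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers: List[Tuple[int, int, int]]) -> int:
    """Receptive field of stacked conv layers along one axis.

    Each layer is (kernel_size, stride, dilation).
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent feature-map cells
    for kernel_size, stride, dilation in layers:
        k_eff = effective_kernel_size(kernel_size, dilation)
        rf += (k_eff - 1) * jump
        jump *= stride
    return rf


# Hypothetical configurations, for illustration only.
plain_stack = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated_stack = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain_stack))    # 7
print(receptive_field(dilated_stack))  # 15
```

With the same number of 3×3 layers and the same parameter count, the dilated stack in this example covers roughly twice the spatial extent. That wider coverage is the source of the extra contextual information around small objects.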