Substantial effort has recently been devoted to object detection methods for optical remote sensing images. However, existing surveys of datasets and deep learning based methods for this task are inadequate. Moreover, most existing datasets have shortcomings: the numbers of images and object categories are small, and image diversity and variation are insufficient. These limitations greatly hinder the development of deep learning based object detection methods. In this paper, we provide a comprehensive review of recent progress in deep learning based object detection in both the computer vision and earth observation communities. We then propose a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, which we name DIOR. The dataset contains 23463 images and 192472 instances, covering 20 object classes. The proposed DIOR dataset 1) is large-scale in the number of object categories, object instances, and total images; 2) has a large range of object size variations, not only in terms of spatial resolution but also in terms of inter- and intra-class size variability across objects; 3) exhibits large variations, as the images are obtained under different imaging conditions, weather, seasons, and image quality; and 4) has high inter-class similarity and intra-class diversity. The proposed benchmark can help researchers develop and validate their data-driven methods. Finally, we evaluate several state-of-the-art approaches on the DIOR dataset to establish a baseline for future research.