Aerial scene classification remains challenging because: 1) the size of the key objects that determine the scene scheme varies greatly; 2) the image is often flooded with objects irrelevant to the scene scheme. Hence, effectively perceiving regions of interest (RoIs) of various sizes and building a more discriminative representation from such a complicated object distribution are vital for understanding an aerial scene. In this paper, we propose a novel all grains, one scheme (AGOS) framework to tackle these challenges. To the best of our knowledge, it is the first work to extend classic multiple instance learning (MIL) into a multi-grain formulation. Specifically, it consists of a multi-grain perception (MGP) module, a multi-branch multi-instance representation (MBMIR) module and a self-aligned semantic fusion (SSF) module. Firstly, our MGP preserves the differential dilated convolutional features from the backbone, which magnifies the discriminative information from multiple grains. Then, our MBMIR highlights the key instances in the multi-grain representation under the MIL formulation. Finally, our SSF allows our framework to learn the same scene scheme from multi-grain instance representations and fuses them, so that the entire framework is optimized as a whole. Notably, our AGOS is flexible and can be easily adapted to existing CNNs in a plug-and-play manner. Extensive experiments on the UCM, AID and NWPU benchmarks demonstrate that our AGOS achieves performance comparable to state-of-the-art methods.
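To make the MIL formulation concrete, the following is a minimal sketch of how per-instance class scores from several grains might be pooled into bag-level scores and then fused into a single scene prediction. It is an illustration under stated assumptions, not the AGOS implementation: the function names (`mil_pool`, `fuse_grains`), the saliency-based instance weighting, and the fusion by simple averaging are all hypothetical choices made here for clarity, and the instance scores are assumed to be given rather than produced by dilated convolutions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mil_pool(instance_scores):
    """Pool per-instance class scores into one bag-level score vector.

    instance_scores: list of instances, each a list of per-class scores.
    Assumption (not from the paper): each instance is weighted by a softmax
    over its own maximum class score, so confident instances dominate.
    """
    saliency = [max(scores) for scores in instance_scores]
    weights = softmax(saliency)
    n_classes = len(instance_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, instance_scores))
        for c in range(n_classes)
    ]


def fuse_grains(per_grain_instance_scores):
    """Fuse bag-level scores from several grains into class probabilities.

    Assumption (not from the paper): grains are fused by a plain average
    of their bag-level scores, followed by a softmax.
    """
    bag_scores = [mil_pool(grain) for grain in per_grain_instance_scores]
    n_grains = len(bag_scores)
    n_classes = len(bag_scores[0])
    fused = [sum(b[c] for b in bag_scores) / n_grains for c in range(n_classes)]
    return softmax(fused)


# Toy input: two grains with different numbers of instances, two classes.
grain_a = [[2.0, 0.1], [0.2, 0.3], [1.5, 0.0]]
grain_b = [[3.0, 0.5], [0.1, 0.2]]
probs = fuse_grains([grain_a, grain_b])
print(probs)
```

Each grain may contribute a different number of instances, since pooling reduces every grain to one score vector per class before fusion. In a trained model the instance weights would typically be learned rather than derived from the scores themselves.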