Deep neural networks tend to reproduce the biases of their training datasets. In object detection, this bias takes the form of several imbalances, such as class, background-foreground, and object-size imbalance. In this paper, we define the size of an object as the number of pixels it covers in an image, and size imbalance as the over-representation of certain object sizes in a dataset. We aim to address the problem of size imbalance in drone-based aerial image datasets. Existing methods for handling size imbalance rely on architectural changes that use multiple scales of images or feature maps to detect objects of different sizes. We, on the other hand, propose a novel ARchitectUre-agnostic BAlanced Loss (ARUBA) that can be applied as a plugin on top of any object detection model. It follows a neighborhood-driven approach inspired by the ordinality of object size. We evaluate the effectiveness of our approach through comprehensive experiments on aerial datasets such as HRSC2016, DOTAv1.0, DOTAv1.5 and VisDrone, and obtain consistent improvements in performance.
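To make the notions of object size, size imbalance, and a neighborhood over ordinal sizes concrete, the following is a minimal illustrative sketch. It is not the ARUBA loss itself, whose formulation is not given in this abstract. The log-scale binning, the `radius` parameter, the inverse-frequency weighting, and the function names `size_bin` and `neighborhood_weights` are all assumptions introduced here for illustration only.

```python
import math
from collections import Counter


def size_bin(num_pixels):
    """Map an object's pixel count to an ordinal size bin (log2 scale).

    Assumption: log-scale binning is an illustrative choice, not taken
    from the paper.
    """
    return int(math.floor(math.log2(max(num_pixels, 1))))


def neighborhood_weights(pixel_counts, radius=1):
    """Return a hypothetical per-bin loss weight for each occupied size bin.

    Each bin's effective count is the sum of the counts of all bins within
    `radius` of it, so neighboring sizes share statistics (size is ordinal).
    Weights are inversely proportional to the effective count and are
    normalized to have mean 1 over the occupied bins.
    """
    counts = Counter(size_bin(p) for p in pixel_counts)
    effective = {
        b: sum(counts.get(b + d, 0) for d in range(-radius, radius + 1))
        for b in counts
    }
    raw = {b: 1.0 / effective[b] for b in counts}
    mean_raw = sum(raw.values()) / len(raw)
    return {b: w / mean_raw for b, w in raw.items()}


# Toy dataset: many 16-pixel objects, a few 32-pixel ones, one 1024-pixel one.
sizes = [16] * 8 + [32] * 2 + [1024]
weights = neighborhood_weights(sizes)
```

In this toy example the 32-pixel bin is rare on its own but sits next to the well-populated 16-pixel bin, so under this assumed scheme it receives the same weight as its neighbor, while the isolated 1024-pixel bin receives a larger weight. Such weights could in principle multiply a per-object detection loss without changing the model architecture.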