Computer vision-based object detection is a key modality for advanced Detect-And-Avoid systems that enable autonomous flight missions of UAVs. Standard object detection frameworks do not predict the depth of an object, yet this information is crucial for avoiding collisions. In this paper, we propose several novel extensions to state-of-the-art methods for monocular object detection from images at long range. First, we propose Sigmoid and ReLU-like encodings for modeling depth estimation as a regression task. Second, we frame depth estimation as a classification problem and introduce a Soft-Argmax function in the calculation of the training loss. As an example, we apply the extensions to the YOLOX object detection framework. We evaluate performance on the Amazon Airborne Object Tracking dataset. In addition, we introduce the Fitness score, a new metric that jointly assesses object detection and depth estimation performance. Our results show that the proposed methods outperform state-of-the-art approaches with respect to both the existing metrics and the proposed one.
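
As a rough illustration of the two depth formulations named above, the following sketch shows one plausible way to decode a depth value from a Sigmoid-encoded regression output and from per-bin classification logits via a Soft-Argmax. It is not the paper's implementation: the function names, the maximum range `d_max`, and the bin centres are assumptions made for this example, and the exact encodings and loss used in the paper may differ.

```python
import math


def sigmoid_depth(raw, d_max):
    """Hypothetical Sigmoid encoding: squash an unbounded output into (0, d_max)."""
    # Numerically stable logistic function for both signs of `raw`.
    if raw >= 0:
        s = 1.0 / (1.0 + math.exp(-raw))
    else:
        e = math.exp(raw)
        s = e / (1.0 + e)
    return d_max * s


def softmax(logits):
    """Softmax with max-subtraction to avoid overflow."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax_depth(logits, bin_centers):
    """Soft-Argmax: probability-weighted mean of the depth bin centres.

    Unlike a hard argmax, this value varies smoothly with the logits,
    so a distance-based loss on it can be back-propagated.
    """
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, bin_centers))


if __name__ == "__main__":
    # Assumed values, chosen only for illustration (metres).
    centers = [100.0, 300.0, 500.0, 700.0]
    print(sigmoid_depth(0.0, d_max=1000.0))
    print(soft_argmax_depth([0.0, 0.0, 0.0, 0.0], centers))
    print(soft_argmax_depth([0.0, 0.0, 50.0, 0.0], centers))
```

With uniform logits the Soft-Argmax returns the mean of the bin centres, and as one logit dominates it approaches that bin's centre, which is what lets a classification head yield a continuous depth estimate.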