This paper introduces a novel approach to video object detection and tracking on Unmanned Aerial Vehicles (UAVs). By incorporating metadata, the proposed approach creates a memory map of object locations in real-world coordinates, providing a more robust and interpretable representation of object locations in both image space and the real world. We use this representation to boost confidence scores, resulting in improved performance on several temporal computer vision tasks, such as video object detection, short- and long-term single- and multi-object tracking, and video anomaly detection. These findings confirm the benefits of metadata in enhancing the capabilities of UAVs in temporal computer vision and pave the way for further advancements in this area.
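
To make the idea of a world-coordinate memory map concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a pinhole camera pointing straight down (nadir), flat ground, image top aligned with the UAV's forward direction, and metadata that supplies altitude, heading, and the UAV position in a local east/north frame in metres. The names `pixel_to_ground` and `MemoryMap`, the grid cell size, and the boosting rule are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def pixel_to_ground(u, v, width, height, focal_px, altitude_m,
                    yaw_rad, uav_east, uav_north):
    """Project a pixel onto flat ground for a nadir-pointing camera (assumption)."""
    # Offsets from the image centre, scaled to metres on the ground plane.
    x_right = (u - width / 2) * altitude_m / focal_px   # toward image right
    y_fwd = (height / 2 - v) * altitude_m / focal_px    # toward image top
    # Rotate into east/north; yaw is measured clockwise from north.
    east = uav_east + x_right * math.cos(yaw_rad) + y_fwd * math.sin(yaw_rad)
    north = uav_north - x_right * math.sin(yaw_rad) + y_fwd * math.cos(yaw_rad)
    return east, north


class MemoryMap:
    """Grid over world coordinates that remembers past detection confidences."""

    def __init__(self, cell_m=2.0, gain=0.2):
        self.cell_m = cell_m              # hypothetical grid resolution in metres
        self.gain = gain                  # hypothetical boosting strength
        self.cells = defaultdict(float)   # cell -> highest raw confidence seen

    def _key(self, east, north):
        return (math.floor(east / self.cell_m), math.floor(north / self.cell_m))

    def boost(self, east, north, conf):
        """Return conf raised toward 1 if this cell held a detection before."""
        key = self._key(east, north)
        prior = self.cells[key]
        boosted = conf + self.gain * prior * (1.0 - conf)
        # Store the raw confidence so that boosts do not compound over time.
        self.cells[key] = max(prior, conf)
        return boosted


if __name__ == "__main__":
    memory = MemoryMap()
    # Detection 200 px right of centre, seen from 50 m with a 1000 px focal length.
    east, north = pixel_to_ground(1160, 540, 1920, 1080, 1000.0, 50.0,
                                  0.0, 10.0, 20.0)
    print(memory.boost(east, north, 0.6))              # first sighting, no boost
    print(memory.boost(east + 0.5, north + 0.5, 0.5))  # same cell, boosted
```

Because the map is indexed by ground position rather than by pixel, a detection that reappears after the camera has moved still falls into the same cell, which is what would make such a representation usable across frames. A real system would also need to handle camera tilt, terrain elevation, and metadata noise.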