Drone Visual Active Tracking aims to autonomously follow a target object by controlling the drone's motion system based on visual observations, offering a more practical solution for tracking in dynamic environments. However, accurate Drone Visual Active Tracking with reinforcement learning remains challenging because no unified benchmark exists and open-world environments are complex and subject to frequent interference. To address these issues, we present a systematic solution. First, we propose DAT, the first open-world benchmark for drone active air-to-ground tracking. It encompasses 24 city-scale scenes and features targets with human-like behaviors and high-fidelity dynamics simulation. DAT also provides a digital twin tool for unlimited scene generation. Second, we propose a novel reinforcement learning method, GC-VAT, which aims to improve drone tracking performance in complex scenarios. Specifically, we design a Goal-Centered Reward that provides precise feedback to the agent across viewpoints, enabling it to expand its perception and movement range through unrestricted perspectives. Inspired by curriculum learning, we introduce a Curriculum-Based Training strategy that progressively improves tracking performance in complex environments. Experiments on the simulator and on real-world images demonstrate the superior performance of GC-VAT, which achieves a Tracking Success Rate of approximately 72% on the simulator. The benchmark and code are available at https://github.com/SHWplus/DAT_Benchmark.