Target tracking in unknown real-world environments, in the presence of obstacles and target motion uncertainty, requires agents to develop an intrinsic understanding of the environment in order to predict suitable actions at each time step. The agents must maximize the visibility of a mobile target maneuvering randomly in a network of roads by learning a policy that accounts for the various aspects of a real-world environment. In this paper, we propose TF-DDQN, a DDQN-based extension of TF-DQN, the state of the art in UAV-based target tracking, that isolates the value estimation and evaluation steps. Additionally, in order to carefully benchmark the performance of any given target tracking algorithm, we introduce a novel target tracking evaluation scheme that quantifies its efficacy in terms of a wide and diverse set of parameters. To replicate real-world settings, we test our approach against standard baselines on the task of target tracking in complex environments with varying drift conditions and changes in environmental configuration.
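As a rough illustration of what isolating estimation from evaluation can look like, the sketch below computes bootstrapped targets in the style of standard Double DQN: the online network's Q-values select the greedy next action, and the target network's Q-values score that action. This is an assumption-laden sketch of the generic Double DQN update, not the TF-DDQN implementation, which the abstract does not detail; the function name `double_dqn_targets`, its parameters, and the toy values are hypothetical.

```python
def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Generic Double DQN targets for a batch of transitions (illustrative only).

    rewards        -- list of scalar rewards, one per transition
    dones          -- list of booleans, True if the episode ended
    q_online_next  -- per-transition lists of Q-values from the online network
    q_target_next  -- per-transition lists of Q-values from the target network
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # Selection step: the online network picks the greedy next action.
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # Evaluation step: the target network scores the selected action.
        # Terminal transitions do not bootstrap.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


# Hypothetical toy batch: two transitions, two actions each.
example_targets = double_dqn_targets(
    rewards=[1.0, 0.0],
    dones=[False, True],
    q_online_next=[[1.0, 2.0], [3.0, 0.0]],
    q_target_next=[[5.0, 0.5], [7.0, 8.0]],
    gamma=0.9,
)
```

In the first toy transition the online network prefers action 1, so the target uses the target network's value for action 1 (0.5) rather than its own maximum (5.0). Avoiding that maximum is how Double DQN reduces overestimation.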