Domain-adaptive trajectory imitation is a skill that some predators learn for survival: they map dynamic information from one domain (their own speed and steering direction) to a different domain (the current position of the moving prey). An intelligent agent with this skill could be applied to diverse tasks, including recognizing abnormal motion in traffic once it has learned to imitate representative trajectories. To this end, we propose DATI, a deep reinforcement learning agent designed for domain-adaptive trajectory imitation using a cycle-consistent generative adversarial method. Our experiments on a variety of synthetic families of reference trajectories show that DATI outperforms baseline methods for imitation learning and optimal control in this setting while keeping the same per-task hyperparameters. We demonstrate its generalization to a real-world scenario by discovering abnormal motion patterns in maritime traffic, opening the door to the use of deep reinforcement learning methods for spatially unconstrained trajectory data mining.