Aerial Vision-and-Dialog Navigation

The ability to converse with humans and follow commands in natural language is crucial for intelligent unmanned aerial vehicles (a.k.a. drones). It can relieve people of the burden of holding a controller at all times, allow multitasking, and make drone control more accessible for people with disabilities or with their hands occupied. To this end, we introduce Aerial Vision-and-Dialog Navigation (AVDN), a task in which a drone is navigated via natural language conversation. We build a drone simulator with a continuous photorealistic environment and collect a new AVDN dataset of over 3k recorded navigation trajectories with asynchronous human-human dialogs between commanders and followers. The commander provides an initial navigation instruction and further guidance upon request, while the follower navigates the drone in the simulator and asks questions when needed. During data collection, the followers' attention on the drone's visual observation is also recorded. Based on the AVDN dataset, we study the tasks of aerial navigation from (full) dialog history and propose an effective Human Attention Aided (HAA) baseline model, which learns to predict both navigation waypoints and human attention. Dataset and code will be released.

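The abstract states that the HAA baseline learns to predict both navigation waypoints and human attention. One common way to train a model on two targets at once is a weighted sum of two losses; the following is a minimal sketch of that idea only, not the authors' implementation. All names (`waypoint_loss`, `attention_loss`, `joint_loss`, `att_weight`), the choice of mean squared error and binary cross-entropy, and the flat list representation of the attention mask are assumptions made for illustration.

```python
import math


def waypoint_loss(pred, target):
    """Mean squared error between predicted and reference waypoint coordinates.

    Assumption: a waypoint is a flat list of coordinates, e.g. [x, y].
    """
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def attention_loss(pred_probs, human_mask, eps=1e-7):
    """Binary cross-entropy between predicted per-cell attention
    probabilities and a recorded human attention mask (0 or 1 per cell).

    Assumption: both inputs are flattened grids of equal length.
    """
    total = 0.0
    for p, m in zip(pred_probs, human_mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(m * math.log(p) + (1 - m) * math.log(1 - p))
    return total / len(pred_probs)


def joint_loss(pred_wp, ref_wp, pred_att, human_att, att_weight=0.5):
    """Hypothetical multi-task objective: waypoint term plus a weighted
    attention term. The weight 0.5 is an arbitrary illustrative default."""
    return waypoint_loss(pred_wp, ref_wp) + att_weight * attention_loss(
        pred_att, human_att
    )


if __name__ == "__main__":
    loss = joint_loss([1.0, 2.0], [1.0, 4.0], [0.5, 0.5], [1, 0])
    print(f"joint loss: {loss:.4f}")
```

In a sketch like this, the attention term acts as an auxiliary signal: it pushes the model toward the image regions that human followers looked at, while the waypoint term drives the actual navigation output.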

Related content

Attention mechanism

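The attention mechanism was first proposed in the field of computer vision, but it became widely popular with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in machine translation; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of attention research.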
