In autonomous robot exploration tasks, a mobile robot must actively explore and map an unknown environment as quickly as possible. Since the environment is revealed progressively during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-the-art exploration planners are frontier- and sampling-based, encouraged by recent developments in deep reinforcement learning (DRL), we propose ARiADNE, an attention-based neural approach to real-time, non-myopic path planning for autonomous exploration. ARiADNE is able to learn dependencies at multiple spatial scales between areas of the agent's partial map, and implicitly predict potential gains associated with exploring those areas. This allows the agent to sequence movement actions that balance the natural trade-off between exploitation/refinement of the map in known areas and exploration of new areas. We experimentally demonstrate that our method outperforms both learning and non-learning state-of-the-art baselines in terms of average trajectory length to complete exploration in hundreds of simplified 2D indoor scenarios. We further validate our approach in high-fidelity Robot Operating System (ROS) simulations, where we consider a real sensor model and a realistic low-level motion controller, toward deployment on real robots.
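To make the core idea concrete, the following is a minimal, illustrative sketch of attention over candidate areas of a partial map. It is not the ARiADNE architecture: the node features, weight matrices, and linear "gain" head are all hypothetical stand-ins, showing only how self-attention can produce context-aware embeddings of map regions from which per-region utilities are scored.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over node features X of shape (n, d).

    Each row of X is a feature vector for one candidate area (e.g. a
    viewpoint on the robot's partial map); the output mixes information
    across all areas, capturing dependencies between them.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    # Row-wise softmax (numerically stabilized).
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 5, 8                              # 5 candidate areas, 8-dim features
X = rng.normal(size=(n, d))              # hypothetical per-area features
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

H = self_attention(X, Wq, Wk, Wv)        # context-aware area embeddings
gains = H @ rng.normal(size=d)           # hypothetical head: predicted utility
next_area = int(np.argmax(gains))        # greedy pick of the next area to visit
```

In a trained planner these weights would be learned so that `gains` reflects expected information gain, and action selection would come from a full policy network rather than a greedy argmax; the sketch only shows the attention mechanism itself.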