Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask whether machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental') maps. A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether biological or artificial. Unlike in animal navigation, we can judiciously design the agent's perceptual system and control the learning paradigm to nullify alternative navigation mechanisms. Specifically, we train 'blind' agents -- with sensing limited to egomotion only -- to perform PointGoal navigation ('go to $\Delta x$, $\Delta y$') via reinforcement learning. Our agents are composed of navigation-agnostic components (fully-connected and recurrent neural networks), and our experimental setup provides no inductive bias towards mapping. Despite these harsh conditions, we find that (1) blind agents are surprisingly effective navigators in new environments (~95% success); (2) they utilize memory over long horizons (remembering ~1,000 steps of past experience in an episode); (3) this memory enables them to exhibit intelligent behavior (following walls, detecting collisions, taking shortcuts); (4) maps and collision-detection neurons emerge in the representations of the environment that a blind agent builds as it navigates; and (5) the emergent maps are selective and task dependent (e.g. the agent 'forgets' exploratory detours). Overall, this paper presents no new techniques for the AI audience, but rather a surprising finding, an insight, and an explanation.