We consider the problem of navigating a mobile robot towards a target in an unknown environment that is endowed with visual sensors, where neither the robot nor the sensors have access to global positioning information and both rely only on first-person-view images. To overcome the need for positioning, we train the sensors to encode and communicate relevant viewpoint information to the mobile robot, whose objective is to use this information to navigate to the target as efficiently as possible. We address the challenge of enabling all the sensors, even those that cannot directly see the target, to predict the direction along the shortest path to the target by implementing a neighborhood-based feature aggregation module using a Graph Neural Network (GNN) architecture. In our experiments, we first demonstrate generalizability to previously unseen environments with various sensor layouts. Our results show that by using communication between the sensors and the robot, we achieve up to a 2.0x improvement in SPL (Success weighted by Path Length) compared to a communication-free baseline. This is achieved without requiring a global map, positioning data, or pre-calibration of the sensor network. Second, we perform a zero-shot transfer of our model from simulation to the real world. Laboratory experiments demonstrate the feasibility of our approach in various cluttered environments. Finally, we showcase examples of successful navigation to the target while the sensor network layout is dynamically reconfigured.
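The sketch below is an illustrative example, not the authors' implementation: it shows one way a neighborhood-based feature aggregation module over the sensor graph could be realized, assuming each sensor has already encoded its first-person view into a fixed-size feature vector. All class and variable names (SensorGNNLayer, msg, upd, head) are hypothetical.

    import torch
    import torch.nn as nn

    class SensorGNNLayer(nn.Module):
        """One round of neighborhood feature aggregation over the sensor graph."""
        def __init__(self, dim):
            super().__init__()
            self.msg = nn.Linear(dim, dim)      # transforms a neighbor's feature into a message
            self.upd = nn.Linear(2 * dim, dim)  # fuses a sensor's own feature with aggregated messages

        def forward(self, x, adj):
            # x:   (N, dim) per-sensor visual features (e.g. from a CNN image encoder)
            # adj: (N, N) adjacency matrix of the communication graph
            messages = adj @ self.msg(x)        # sum incoming messages from neighbors
            return torch.relu(self.upd(torch.cat([x, messages], dim=-1)))

    # Stacking layers lets information from sensors that can see the target
    # propagate to sensors that cannot; a small head then predicts, for each
    # sensor, the direction along the shortest path to the target.
    dim = 64
    gnn1, gnn2 = SensorGNNLayer(dim), SensorGNNLayer(dim)
    head = nn.Linear(dim, 2)                    # 2-D direction estimate per sensor

    x = torch.randn(5, dim)                     # 5 sensors with encoded first-person views
    adj = (torch.rand(5, 5) < 0.4).float()      # random communication graph, for illustration only
    directions = head(gnn2(gnn1(x, adj), adj))  # (5, 2) predicted directions

In practice, the messages exchanged along the graph edges correspond to the communicated viewpoint information, so the robot never needs a global map or sensor positions.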
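For reference, SPL is conventionally defined as SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i), where, for episode i, S_i indicates success, l_i is the shortest-path length from start to target, and p_i is the length of the path the robot actually took; a 2.0x improvement therefore reflects paths that are both more often successful and closer to optimal.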