Underwater navigation presents several challenges, including unstructured, unknown environments, the lack of reliable localization systems (e.g., GPS), and poor visibility. Furthermore, good-quality obstacle detection sensors for underwater robots are scarce and costly, and many sensors, such as RGB-D cameras and LiDAR, work only in air. To enable reliable mapless underwater navigation despite these challenges, we propose a low-cost end-to-end navigation system, based on a monocular camera and a fixed single-beam echo-sounder, that efficiently navigates an underwater robot to waypoints while avoiding nearby obstacles. Our proposed method is based on Proximal Policy Optimization (PPO): the policy takes as input the current relative goal information, estimated depth images, echo-sounder readings, and previously executed actions, and outputs 3D robot actions on a normalized scale. End-to-end training was done in simulation, where we adopted domain randomization (varying underwater conditions and visibility) to learn a policy that is robust to noise and to changes in visibility conditions. Experiments in simulation and in the real world demonstrated that our proposed method navigates a low-cost underwater robot successfully and resiliently in unknown underwater environments. The implementation is publicly available at https://github.com/dartmouthrobotics/deeprl-uw-robot-navigation.
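As an illustration only, the policy interface described above can be sketched as follows. This is a minimal sketch, not the released implementation. The field layout of `Observation`, the interpretation of the three action components (forward speed, vertical speed, yaw rate), the values in `ACTION_LIMITS`, and the visibility range in `sample_visibility` are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Observation:
    """One policy input, mirroring the four inputs named in the abstract."""
    goal: tuple          # relative goal information (layout is an assumption)
    depth_image: list    # estimated depth image, as rows of floats
    echo_range: float    # single-beam echo-sounder reading, in metres
    prev_action: tuple   # previously executed normalized action


# Hypothetical actuator limits: forward speed (m/s), vertical speed (m/s),
# yaw rate (rad/s). These numbers are placeholders, not values from the paper.
ACTION_LIMITS = (0.4, 0.2, 0.5)


def denormalize(action, limits=ACTION_LIMITS):
    """Clip each normalized component to [-1, 1], then scale it to its limit."""
    return tuple(max(-1.0, min(1.0, a)) * lim for a, lim in zip(action, limits))


def sample_visibility(rng, low=0.5, high=5.0):
    """Domain randomization: draw a visibility range (metres) per episode.

    The bounds are assumed for illustration only.
    """
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    obs = Observation(
        goal=(3.0, -0.5, 0.2),
        depth_image=[[1.0, 1.2], [0.9, 1.1]],
        echo_range=2.5,
        prev_action=(0.0, 0.0, 0.0),
    )
    # A trained PPO policy would map obs to a normalized action; a fixed
    # vector stands in for it here.
    normalized_action = (1.0, -2.0, 0.5)
    print(denormalize(normalized_action))
    print(sample_visibility(rng))
```

Keeping the policy output on a normalized scale lets the same network drive vehicles with different speed limits, since only the scaling step changes.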