Outdoor navigation on sidewalks in urban environments is a key technology behind important human-assistive applications, such as last-mile delivery and neighborhood patrol. This paper aims to develop a quadruped robot that follows a route plan generated by public map services while remaining on sidewalks and avoiding collisions with obstacles and pedestrians. We devise a two-stage learning framework, which first trains a teacher agent in an abstract world with privileged ground-truth information and then applies Behavior Cloning to transfer its skills to a student agent that has access only to realistic sensors. The main research effort of this paper focuses on overcoming the challenges of deploying the student policy on a quadruped robot in the real world. We propose methodologies for designing sensing modalities, network architectures, and training procedures that enable zero-shot policy transfer to unstructured and dynamic real outdoor environments. We evaluate our learning framework on a quadruped robot navigating sidewalks in the city of Atlanta, USA. Using the learned navigation policy and its onboard sensors, the robot is able to walk 3.2 kilometers with a limited number of human interventions.
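The teacher-student idea can be illustrated with a toy sketch. This is not the paper's implementation: the two-dimensional state (lateral offset from the sidewalk centre and heading error), the linear teacher and student policies, the Gaussian sensor noise, and every function name below are assumptions chosen only to keep the example small. The teacher acts on the ground-truth state, and the student is fitted by Behavior Cloning, here plain stochastic gradient descent on the squared error against the teacher's actions, while seeing only noisy observations.

```python
import random


def teacher_action(offset, heading):
    """Hypothetical teacher: steers toward the sidewalk centre using ground-truth state."""
    return -0.8 * offset - 0.5 * heading


def noisy_observation(offset, heading, rng, noise_std=0.05):
    """Hypothetical sensor model: the student sees the state only through noisy readings."""
    return (offset + rng.gauss(0.0, noise_std),
            heading + rng.gauss(0.0, noise_std))


def collect_demonstrations(n, rng):
    """Pair each student observation with the action the teacher takes in the same state."""
    data = []
    for _ in range(n):
        offset = rng.uniform(-1.0, 1.0)
        heading = rng.uniform(-1.0, 1.0)
        obs = noisy_observation(offset, heading, rng)
        data.append((obs, teacher_action(offset, heading)))
    return data


def train_student(data, epochs=50, lr=0.05):
    """Behavior Cloning: SGD on the squared error between student and teacher actions."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (o1, o2), action in data:
            err = w[0] * o1 + w[1] * o2 + b - action
            w[0] -= lr * err * o1
            w[1] -= lr * err * o2
            b -= lr * err
    return w, b


def student_action(params, obs):
    """Linear student policy that uses sensor observations only."""
    w, b = params
    return w[0] * obs[0] + w[1] * obs[1] + b


def evaluate(params, n, rng):
    """Mean squared gap between student and teacher actions on fresh states."""
    total = 0.0
    for _ in range(n):
        offset = rng.uniform(-1.0, 1.0)
        heading = rng.uniform(-1.0, 1.0)
        obs = noisy_observation(offset, heading, rng)
        gap = student_action(params, obs) - teacher_action(offset, heading)
        total += gap * gap
    return total / n


def run_demo(seed=0):
    rng = random.Random(seed)
    demos = collect_demonstrations(1000, rng)
    params = train_student(demos)
    return params, evaluate(params, 500, rng)


if __name__ == "__main__":
    (weights, bias), mse = run_demo()
    print("student weights:", weights, "bias:", bias)
    print("held-out MSE against the teacher:", mse)
```

In this toy setting the student recovers weights close to the teacher's despite never seeing the ground-truth state. A real system would replace the linear student with a neural network over camera or depth inputs, which is where the sensing, architecture, and training choices discussed in the abstract come in.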