While current autonomous navigation systems allow robots to successfully drive themselves from one point to another in specific environments, they typically require extensive manual parameter re-tuning by human robotics experts in order to function in new environments. Furthermore, even for just one complex environment, a single set of fine-tuned parameters may not work well in different regions of that environment. These problems prohibit reliable mobile robot deployment by non-expert users. As a remedy, we propose Adaptive Planner Parameter Learning (APPL), a machine learning framework that can leverage non-expert human interaction via several modalities -- including teleoperated demonstrations, corrective interventions, and evaluative feedback -- and also unsupervised reinforcement learning to learn a parameter policy that can dynamically adjust the parameters of classical navigation systems in response to changes in the environment. APPL inherits safety and explainability from classical navigation systems while also enjoying the benefits of machine learning, i.e., the ability to adapt and improve from experience. We present a suite of individual APPL methods and also a unifying cycle-of-learning scheme that combines all the proposed methods in a framework that can improve navigation performance through continual, iterative human interaction and simulation training.
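As a rough illustration of what a parameter policy means here, the sketch below maps the robot's current observation to a parameter set for a classical planner and re-selects that set at every step. It is a minimal sketch under stated assumptions, not the APPL implementation: the names `PlannerParams`, `parameter_policy`, `max_vel_x`, `inflation_radius`, and `clearance_threshold`, as well as all numeric values, are hypothetical, and a hand-written rule stands in for the policy that APPL would learn from demonstrations, interventions, evaluative feedback, or reinforcement learning.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PlannerParams:
    """Hypothetical parameter set for a classical local planner."""
    max_vel_x: float         # forward speed limit in m/s (assumed name)
    inflation_radius: float  # obstacle inflation in m (assumed name)


# Illustrative parameter sets; the values are made up for this sketch.
OPEN_SPACE = PlannerParams(max_vel_x=1.5, inflation_radius=0.5)
TIGHT_SPACE = PlannerParams(max_vel_x=0.4, inflation_radius=0.1)


def parameter_policy(scan_ranges: Sequence[float],
                     clearance_threshold: float = 1.0) -> PlannerParams:
    """Map the current observation to a planner parameter set.

    A hand-written rule stands in for a learned policy: the distance to
    the closest obstacle decides which parameter set is used.
    """
    if not scan_ranges:
        raise ValueError("scan_ranges must not be empty")
    if min(scan_ranges) < clearance_threshold:
        return TIGHT_SPACE
    return OPEN_SPACE


def select_params_along_path(scans: Sequence[Sequence[float]]) -> List[PlannerParams]:
    """Re-select parameters at every step, as a navigation loop would."""
    return [parameter_policy(scan) for scan in scans]
```

The underlying planner is left unchanged in this picture; only its parameters vary with the context, which is how the abstract describes the approach retaining the safety and explainability of classical navigation.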