Human-to-human conversation is more than talking and listening: it is an incremental process in which participants continually establish common ground to rule out misunderstandings. Current language-understanding methods for intelligent robots do not account for this. Numerous approaches handle non-understandings, but they ignore the incremental process of resolving misunderstandings. In this article, we present a first formalization and experimental validation of incremental action repair for robotic instruction following based on reinforcement learning. To evaluate our approach, we propose a collection of benchmark environments for action correction in language-conditioned reinforcement learning, using a synthetic instructor to generate language goals and their corresponding corrections. We show that a reinforcement learning agent can successfully learn to understand incremental corrections of misunderstood instructions.
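To make the setting concrete, the interaction loop described above can be sketched as a toy environment with a synthetic instructor that issues a language goal and, when the agent commits to the wrong object, emits a single incremental correction. This is a minimal illustrative sketch only: the object names, reward values, and correction template are assumptions for illustration, not the paper's actual benchmark.

```python
import random

class CorrectionEnv:
    """Toy language-conditioned environment with a synthetic instructor.

    Hypothetical sketch: objects, rewards, and the correction template
    are illustrative assumptions, not the benchmark from the paper.
    """
    OBJECTS = ["red ball", "blue box", "green key"]

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        """Instructor samples a target and utters a language goal."""
        self.target = self.rng.choice(self.OBJECTS)
        self.corrected = False
        return f"go to the {self.target}"

    def step(self, chosen_object):
        """Agent commits to an object; the instructor repairs one wrong choice.

        Returns (utterance_or_None, reward, done).
        """
        if chosen_object == self.target:
            return None, 1.0, True           # correct object: success
        if not self.corrected:
            self.corrected = True
            # Incremental action repair: episode continues with a correction.
            return f"no, the {self.target}", -0.1, False
        return None, -1.0, True              # failed even after the correction


def parse_target(utterance):
    """Trivial 'agent' language grounding: find a known object in the utterance."""
    for obj in CorrectionEnv.OBJECTS:
        if obj in utterance:
            return obj
    return None
```

A learned agent would replace `parse_target` with a policy conditioned on the full utterance history, so that the correction updates its belief about the goal mid-episode rather than restarting the task.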