Planning in realistic environments requires searching in large planning spaces. Affordances are a powerful concept for simplifying this search because they model which actions can succeed in a given situation. However, the classical notion of affordance is not suitable for long-horizon planning: it only informs the robot about the immediate outcome of an action, not about which actions best serve a long-term goal. In this paper, we introduce a new affordance representation that enables the robot to reason about the long-term effects of actions by modeling which actions will be afforded in the future, thereby informing the robot of the best actions to take next to achieve a task goal. Based on this new representation, we develop a learning-to-plan method, Deep Affordance Foresight (DAF), that learns partial environment models of the affordances of parameterized motor skills through trial and error. We evaluate DAF on two challenging manipulation domains and show that it can effectively learn to carry out multi-step tasks, share learned affordance representations among different tasks, and learn to plan with high-dimensional image inputs. Additional material is available at https://sites.google.com/stanford.edu/daf