We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, focusing on the manipulation of deformable objects. For task planning we propose a Latent Space Roadmap (LSR), a graph-based structure that globally captures the system dynamics in a low-dimensional latent space. The framework consists of three parts: (1) a Mapping Module (MM) that maps observations, given as images, into a structured latent space to extract the corresponding states, and that also generates observations from latent states; (2) the LSR, which builds and connects clusters of similar states in order to find latent plans between the start and goal states extracted by the MM; and (3) the Action Proposal Module, which complements the latent plan found by the LSR with the corresponding actions. We present a thorough investigation of our framework on simulated box stacking and rope/box manipulation tasks, and on a folding task executed on a real robot.
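To make the roadmap idea concrete, the following is a minimal sketch of how clusters of latent states could be connected into a graph and searched for a latent plan. It is an illustration under stated assumptions, not the method of this work: latent states are assumed to be already extracted (the image encoder is omitted), clustering is a simple greedy distance threshold `eps`, edges are assumed to be directed, the search is a plain breadth-first search, and the action attached to each edge is simply the one stored with the training pair rather than the output of a learned module. The names `build_lsr`, `plan`, and `nearest_cluster` are hypothetical.

```python
import math
from collections import deque


def nearest_cluster(z, centroids):
    """Index of the centroid closest to latent state z (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(z, centroids[i]))


def build_lsr(transitions, eps):
    """Build a toy roadmap from (z_from, z_to, action) tuples.

    Assumption: greedy clustering. A state joins the first cluster whose
    centroid lies within eps; otherwise it starts a new cluster. Each
    centroid stays fixed at its first member to keep the sketch short.
    """
    centroids = []
    edges = {}  # cluster index -> {neighbor cluster index: action}

    def assign(z):
        for i, c in enumerate(centroids):
            if math.dist(z, c) <= eps:
                return i
        centroids.append(z)
        return len(centroids) - 1

    for z_from, z_to, action in transitions:
        a, b = assign(z_from), assign(z_to)
        if a != b:  # ignore pairs that stay inside one cluster
            edges.setdefault(a, {})[b] = action
    return centroids, edges


def plan(centroids, edges, z_start, z_goal):
    """Breadth-first search between the clusters nearest to start and goal.

    Returns (cluster path, actions along the path), or None if the goal
    cluster is not reachable from the start cluster.
    """
    start = nearest_cluster(z_start, centroids)
    goal = nearest_cluster(z_goal, centroids)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in edges.get(node, {}):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if goal not in parent:
        return None
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    actions = [edges[u][v] for u, v in zip(path, path[1:])]
    return path, actions


# Hypothetical 2-D latent states and action labels, for illustration only.
transitions = [
    ((0.0, 0.0), (1.0, 0.0), "action A"),
    ((1.0, 0.1), (2.0, 0.0), "action B"),
    ((2.0, 0.1), (3.0, 0.0), "action C"),
]
centroids, edges = build_lsr(transitions, eps=0.3)
result = plan(centroids, edges, z_start=(0.1, 0.0), z_goal=(2.9, 0.0))
```

In this toy setup, the three training pairs yield four clusters chained by three edges, so the plan from the first to the last cluster passes through all of them. In a full system each cluster on the path would be decoded back into an image to give a visual plan, which this sketch does not attempt.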