Manipulating clusters of deformable objects is a substantial challenge with widespread applicability, and one that requires contact-rich whole-arm interactions. A potential solution must address, among other issues, the limited capacity for realistic model synthesis, high uncertainty in perception, and the lack of efficient spatial abstractions. We propose a novel framework for learning model-free policies that integrates two modalities, 3D point clouds and proprioceptive touch indicators, and emphasises manipulation with full-body contact awareness, going beyond traditional end-effector modes. Our reinforcement learning framework leverages a distributional state representation, aided by kernel mean embeddings, to achieve improved training efficiency and real-time inference. Furthermore, we propose a novel context-agnostic occlusion heuristic to clear deformables from a target region for exposure tasks. We deploy the framework in a power line clearance scenario and observe that the agent generates creative strategies leveraging multiple arm links for de-occlusion. Finally, we perform zero-shot sim-to-real policy transfer, allowing the arm to clear real branches with unknown occlusion patterns, unseen topology, and uncertain dynamics. Website: https://sites.google.com/view/dcmwap/
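To illustrate what a distributional state representation built on kernel mean embeddings can look like, the following is a minimal sketch and not the authors' implementation. It assumes an RBF kernel approximated with random Fourier features, so that a point cloud of any size maps to a fixed-length vector that does not depend on point ordering. The function names (`make_rff`, `mean_embedding`), the lengthscale, and the feature count are hypothetical choices made for illustration; the abstract does not specify the kernel or the approximation used.

```python
import numpy as np


def make_rff(dim, n_features, lengthscale, seed=0):
    """Sample random Fourier feature parameters for an RBF kernel (assumed kernel)."""
    rng = np.random.default_rng(seed)
    # Frequencies drawn from the spectral density of the RBF kernel.
    W = rng.normal(scale=1.0 / lengthscale, size=(dim, n_features))
    # Random phase offsets in [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def mean_embedding(points, W, b):
    """Approximate kernel mean embedding of a point cloud of shape (N, dim)."""
    n_features = W.shape[1]
    # Feature map whose inner products approximate the RBF kernel.
    phi = np.sqrt(2.0 / n_features) * np.cos(points @ W + b)
    # Averaging over points gives a fixed-size, permutation-invariant vector.
    return phi.mean(axis=0)


# Hypothetical usage with synthetic clouds of different sizes.
W, b = make_rff(dim=3, n_features=256, lengthscale=0.5)
rng = np.random.default_rng(1)
cloud_a = rng.normal(size=(500, 3))             # reference cloud
cloud_b = rng.normal(size=(1200, 3))            # same distribution, more points
cloud_c = rng.normal(loc=2.0, size=(800, 3))    # shifted distribution

emb_a = mean_embedding(cloud_a, W, b)
emb_b = mean_embedding(cloud_b, W, b)
emb_c = mean_embedding(cloud_c, W, b)

# Distances between embeddings approximate the maximum mean discrepancy.
dist_same = np.linalg.norm(emb_a - emb_b)
dist_shifted = np.linalg.norm(emb_a - emb_c)
```

In a setup like this, the embedding vector could be concatenated with touch indicators and fed to a policy network. Because its length is fixed regardless of how many points the sensor returns, the cost of building the state stays linear in the number of points, which is one plausible reading of the efficiency claim in the abstract.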