Diffusion Policy has shown strong performance in robotic manipulation tasks under stochastic perturbations, owing to its ability to model multimodal action distributions. However, its reliance on a computationally expensive reverse-time diffusion (denoising) process for action inference makes it challenging to use in real-time applications where fast decision-making is mandatory. This work studies the possibility of running the denoising process only partially before executing an action, allowing the plant to evolve according to its own dynamics in parallel with the reverse-time diffusion dynamics running on the computer. In the classical diffusion-policy setting, the plant dynamics are usually slow and the two dynamical processes are uncoupled. Here, we investigate theoretical bounds on the stability of closed-loop systems driven by diffusion policies when the plant dynamics and the denoising dynamics are coupled. The contributions of this work are a framework for faster imitation learning and a metric that indicates, based on the variance of the demonstrations, whether a controller will be stable.
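To make the coupled setting concrete, a minimal sketch of the joint dynamics follows; the symbols $h$ (plant vector field), $b$ and $g$ (drift and diffusion coefficients of the forward noising process), $p_\tau$ (the marginal density at noise level $\tau$), and the map $\tau(t)$ from wall-clock time to denoising time are illustrative assumptions, not notation fixed by this work:

$$
\dot{x}(t) = h\big(x(t),\, a_{\tau(t)}\big),
\qquad
\mathrm{d}a_\tau = \Big[\, b(a_\tau, \tau) - g(\tau)^2\, \nabla_{a} \log p_\tau\big(a_\tau \mid x(t)\big) \Big]\, \mathrm{d}\tau + g(\tau)\, \mathrm{d}\bar{W}_\tau .
$$

Here $\tau$ runs backward from the terminal noise level toward $0$ while $t$ advances, so executing an action at some $\tau > 0$ amounts to acting on a partially denoised sample; the coupling enters both through the score's conditioning on the evolving state $x(t)$ and through $a_{\tau(t)}$ feeding back into the plant.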