Robot manipulation, a core capability of embodied AI, has turned to data-driven generative policy frameworks, but the mainstream approaches each have drawbacks: Diffusion Models suffer from high inference latency, while flow-based methods incur increased architectural complexity. Although directly applying MeanFlow to robotic tasks achieves single-step inference and outperforms FlowPolicy, it lacks few-shot generalization due to the fixed temperature hyperparameter in its Dispersive Loss and the misalignment between predicted and ground-truth mean velocities. To address these issues, this study proposes an improved MeanFlow-based policy: we introduce a lightweight Cosine Loss to align velocity directions and use a Differential Derivation Equation (DDE) to optimize the Jacobian-Vector Product (JVP) operator. Experiments on Adroit and Meta-World tasks show that the proposed method outperforms MP1 and FlowPolicy in average success rate, particularly on challenging Meta-World tasks, effectively improving the few-shot generalization and trajectory accuracy of robot manipulation policies while maintaining real-time performance, thereby offering a more robust solution for high-precision robotic manipulation.
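The two ingredients named above can be sketched concretely. Below is a minimal NumPy illustration, assuming the mean velocities are plain arrays; the function names (`cosine_alignment_loss`, `finite_difference_jvp`) are hypothetical stand-ins, not identifiers from the paper, and the central-difference JVP is one simple way a derivative-based approximation of the JVP operator could look.

```python
import numpy as np

def cosine_alignment_loss(u_pred, u_true, eps=1e-8):
    """Penalize directional misalignment between predicted and
    ground-truth mean velocities: loss = 1 - cos(u_pred, u_true)."""
    u_pred = np.asarray(u_pred, dtype=float)
    u_true = np.asarray(u_true, dtype=float)
    cos = np.sum(u_pred * u_true, axis=-1) / (
        np.linalg.norm(u_pred, axis=-1) * np.linalg.norm(u_true, axis=-1) + eps
    )
    return float(np.mean(1.0 - cos))

def finite_difference_jvp(f, x, v, h=1e-4):
    """Approximate the Jacobian-vector product J_f(x) @ v with a
    central difference, avoiding an explicit autodiff JVP pass."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    return (f(x + h * v) - f(x - h * v)) / (2.0 * h)
```

For example, aligned velocities give a loss near 0 and opposed ones a loss near 2, and for `f(x) = x**2` the approximation recovers the exact JVP `2*x*v`.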