Physics-based motion imitation is central to humanoid control, yet current evaluation metrics (e.g., joint position error) measure only how well a policy imitates a motion, not how difficult the motion itself is. This conflates policy performance with motion difficulty and obscures whether failures stem from poor learning or from inherently challenging motions. In this work, we address this gap with the Motion Difficulty Score (MDS), a novel metric that defines and quantifies imitation difficulty independently of policy performance. Grounded in rigid-body dynamics, MDS interprets difficulty as the torque variation induced by small pose perturbations: larger torque-to-pose variation yields flatter reward landscapes and thus higher learning difficulty. MDS captures this through three properties of the perturbation-induced torque space: volume, variance, and temporal variability. We also use MDS to construct MD-AMASS, a difficulty-aware repartitioning of the AMASS dataset. Empirically, we validate MDS by demonstrating its explanatory power for the performance of state-of-the-art motion imitation policies. We further demonstrate the utility of MDS through two new MDS-based metrics, Maximum Imitable Difficulty (MID) and Difficulty-Stratified Joint Error (DSJE), which provide fresh insights into imitation learning.
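To make the idea of perturbation-induced torque variation concrete, the following is a minimal toy sketch and not the MDS definition, which the abstract does not spell out. It assumes a planar two-link arm with point masses at the link tips, considers gravity torques only (quasi-static), and uses illustrative stand-ins for the three properties: the square root of the covariance determinant for volume, the covariance trace for variance, and the mean absolute frame-to-frame change in variance for temporal variability. The names `gravity_torque` and `toy_difficulty_profile` and all parameter values are hypothetical.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def gravity_torque(q, m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Static joint torques that hold a planar two-link arm at pose q.

    Assumption: point masses at the link tips, joint angles measured from
    the horizontal, q[..., 1] relative to the first link.
    """
    q = np.asarray(q, dtype=float)
    q1 = q[..., 0]
    q12 = q[..., 0] + q[..., 1]
    tau2 = m2 * G * l2 * np.cos(q12)
    tau1 = (m1 + m2) * G * l1 * np.cos(q1) + tau2
    return np.stack([tau1, tau2], axis=-1)


def toy_difficulty_profile(motion, eps=1e-2, n_samples=256, seed=0):
    """Summarize torque changes caused by small pose perturbations.

    motion: array of shape (T, 2), one pose per frame.
    Returns illustrative volume, variance, and temporal-variability values;
    these are assumed proxies, not the paper's formulas.
    """
    motion = np.asarray(motion, dtype=float)
    rng = np.random.default_rng(seed)
    # One shared set of perturbations, reused for every frame.
    dq = rng.normal(scale=eps, size=(n_samples, motion.shape[1]))

    volumes, variances = [], []
    for q in motion:
        # Torque change per unit of pose perturbation scale.
        dtau = (gravity_torque(q + dq) - gravity_torque(q)) / eps
        cov = np.cov(dtau, rowvar=False)
        variances.append(np.trace(cov))
        volumes.append(np.sqrt(max(np.linalg.det(cov), 0.0)))

    variances = np.array(variances)
    temporal = float(np.mean(np.abs(np.diff(variances)))) if len(variances) > 1 else 0.0
    return {
        "volume": float(np.mean(volumes)),
        "variance": float(np.mean(variances)),
        "temporal_variability": temporal,
    }


# Two static poses and one swinging motion, 50 frames each.
hanging = np.tile([-np.pi / 2, 0.0], (50, 1))   # arm pointing straight down
outstretched = np.tile([0.0, 0.0], (50, 1))     # arm held horizontally
t = np.linspace(0.0, 2.0 * np.pi, 50)
swinging = np.stack([-np.pi / 2 + np.sin(t), np.zeros_like(t)], axis=-1)

for name, motion in [("hanging", hanging), ("outstretched", outstretched), ("swinging", swinging)]:
    print(name, toy_difficulty_profile(motion))
```

In this toy model the gravity torque depends on the cosine of the joint angles, so its sensitivity to pose is largest when the arm hangs vertically and vanishes to first order when it is horizontal. A static pose gives zero temporal variability because the same perturbations are reused in every frame. A full-body version would need a complete inverse-dynamics model, including inertial and contact terms, which this sketch deliberately omits.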