SMP: Reusable Score-Matching Motion Priors for Physics-Based Character Control

Data-driven motion priors that can guide agents toward producing naturalistic behaviors play a pivotal role in creating life-like virtual characters. Adversarial imitation learning has been a highly effective method for learning motion priors from reference motion data. However, adversarial priors, with few exceptions, need to be retrained for each new controller, which limits their reusability and requires the reference motion data to be retained when training on downstream tasks. In this work, we present Score-Matching Motion Priors (SMP), which leverage pre-trained motion diffusion models and score distillation sampling (SDS) to create reusable, task-agnostic motion priors. SMPs can be pre-trained on a motion dataset, independent of any control policy or task. Once trained, SMPs can be kept frozen and reused as general-purpose reward functions to train policies to produce naturalistic behaviors for downstream tasks. We show that a general motion prior trained on large-scale datasets can be repurposed into a variety of style-specific priors. Furthermore, SMP can compose different styles to synthesize new styles not present in the original dataset. Our method produces high-quality motion comparable to state-of-the-art adversarial imitation learning methods through reusable and modular motion priors. We demonstrate the effectiveness of SMP across a diverse suite of control tasks with physically simulated humanoid characters. A video demo is available at https://youtu.be/ravlZJteS20
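
The idea of reusing a frozen diffusion model as a reward function can be illustrated with a minimal sketch. The abstract does not state the reward formula, so everything below is an assumption made for illustration: the motion is noised at a fixed diffusion level, a frozen noise predictor estimates that noise, and the prediction error (the quantity an SDS-style objective is built on) is mapped to a reward through an exponential. The names `toy_denoiser` and `sds_style_reward`, the parameters `alpha_bar` and `weight`, and the exponential mapping are all hypothetical; the toy denoiser stands in for a real pre-trained motion diffusion model.

```python
import math
import random


def toy_denoiser(x_t, alpha_bar):
    """Hypothetical stand-in for a frozen, pre-trained noise predictor.

    Assumes the "dataset" is a single point at the origin, so the noised
    sample is x_t = sqrt(1 - alpha_bar) * eps and the noise can be
    recovered exactly as x_t / sqrt(1 - alpha_bar).
    """
    scale = math.sqrt(1.0 - alpha_bar)
    return [v / scale for v in x_t]


def sds_style_reward(motion, denoiser, alpha_bar=0.5, weight=1.0,
                     n_samples=16, rng=None):
    """Score a motion feature vector with a frozen denoiser (illustrative).

    The reward is exp(-weight * error), where error is the mean squared
    difference between the predicted and the injected noise, averaged
    over n_samples noise draws. This mapping is an assumption, not the
    formulation used in the paper.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch deterministic
    signal = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    total = 0.0
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in motion]
        # Forward diffusion step: blend the motion with Gaussian noise.
        x_t = [signal * m + noise_scale * e for m, e in zip(motion, eps)]
        eps_hat = denoiser(x_t, alpha_bar)
        total += sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(motion)
    return math.exp(-weight * total / n_samples)


# The denoiser is never updated: only the policy producing `motion` would train.
in_dist = sds_style_reward([0.0, 0.0, 0.0], toy_denoiser)
out_dist = sds_style_reward([2.0, 2.0, 2.0], toy_denoiser)
print(in_dist > out_dist)
```

In this toy setting, a motion that matches the denoiser's training data receives a reward close to 1, while a motion far from it receives a much lower reward. Because the denoiser stays frozen and needs no access to the reference data, the same function could be reused across tasks, which is the reusability property the abstract describes.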
