Physics-based character animation has seen significant advances in recent years with the adoption of Deep Reinforcement Learning (DRL). However, DRL-based learning methods are usually computationally expensive, and their performance depends crucially on the choice of hyperparameters. Tuning hyperparameters for these methods often requires repeated training of control policies, which is even more computationally prohibitive. In this work, we propose a novel Curriculum-based Multi-Fidelity Bayesian Optimization framework (CMFBO) for efficient hyperparameter optimization of DRL-based character control systems. Using curriculum-based task difficulty as the fidelity criterion, our method improves search efficiency by gradually pruning the search space through evaluation on easier motor-skill tasks. We evaluate our method on two physics-based character control tasks: character morphology optimization and hyperparameter tuning of DeepMimic. Our algorithm significantly outperforms state-of-the-art hyperparameter optimization methods applicable to physics-based character animation. In particular, we show that the hyperparameters optimized by our algorithm yield at least a 5x efficiency gain compared to the author-released settings in DeepMimic.
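To make the curriculum-as-fidelity idea concrete, the following is a minimal Python sketch of the pruning loop described above. It is an illustration under assumed names only: `train_and_evaluate`, `sample_candidates`, and `keep_frac` are hypothetical, the toy scoring function stands in for an actual DRL training run, and a simple successive-halving-style cut replaces the Bayesian-optimization acquisition machinery of the actual CMFBO algorithm.

```python
import random

# Hypothetical illustration of curriculum-based multi-fidelity search:
# cheap evaluations on easy curriculum stages prune the hyperparameter
# search space before expensive full-difficulty training runs.

def train_and_evaluate(hparams, difficulty):
    """Placeholder: train a DRL control policy with `hparams` on a
    motor-skill task at curriculum `difficulty` (0.0 = easiest,
    1.0 = full task) and return its score. A real system would launch
    e.g. a DeepMimic training job here; this toy version just prefers
    learning rates near 3e-4."""
    return -((hparams["lr"] - 3e-4) ** 2) * (1.0 + difficulty)

def sample_candidates(n):
    """Draw hyperparameter candidates. A real implementation would
    sample from a Bayesian-optimization surrogate model rather than
    uniformly at random."""
    return [{"lr": 10 ** random.uniform(-5, -2)} for _ in range(n)]

def curriculum_fidelity_search(num_candidates=32,
                               stages=(0.25, 0.5, 1.0),
                               keep_frac=0.5):
    candidates = sample_candidates(num_candidates)
    for difficulty in stages:  # fidelity = curriculum task difficulty
        scored = [(train_and_evaluate(h, difficulty), h) for h in candidates]
        scored.sort(key=lambda s: s[0], reverse=True)
        # Prune the search space: only promising candidates are promoted
        # to the next, harder (and more expensive) curriculum stage.
        keep = max(1, int(len(scored) * keep_frac))
        candidates = [h for _, h in scored[:keep]]
    return candidates[0]

if __name__ == "__main__":
    best = curriculum_fidelity_search()
    print("best hyperparameters found:", best)
```

The design point the sketch captures is that most candidates are eliminated at the cheap, easy-task fidelities, so only a small fraction of the budget is spent on full-difficulty training.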