Motion prediction techniques based on generative models have recently made it possible to predict controlled human motions, for example multiple upper-body motions that share similar lower-body motions. To achieve this, however, state-of-the-art methods must either learn mapping functions in a subsequent step to find similar motions or train the model repeatedly to enable control over the desired portion of the body. In this paper, we propose a novel framework that learns disentangled representations for controllable human motion prediction. Our network uses a conditional variational auto-encoder (CVAE) architecture to model full-body human motion, with an extra CVAE path that learns only the corresponding partial-body (e.g., lower-body) motion. Specifically, the inductive bias imposed by the extra CVAE path encourages the two latent variables in the two paths to govern separate representations, one for each partial-body motion. After a single training run, our model provides two types of control over the generated human motions: (i) strictly controlling one portion of the human body and (ii) adaptively controlling the other portion, by sampling from a pair of latent spaces. Additionally, we extend and adapt a sampling strategy to our trained model to diversify the controllable predictions. Our framework also potentially allows new forms of control by flexibly customizing the input to the extra CVAE path. Extensive experimental results and ablation studies demonstrate that our approach achieves state-of-the-art controllable human motion prediction, both qualitatively and quantitatively.
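To make the two types of control concrete, the following is a minimal toy sketch of sampling from a pair of latent spaces. It is not the network described above: the decoders are fixed random linear maps, the class `ToyTwoPathDecoder`, the function `predict_with_fixed_lower`, and all dimensions are hypothetical, and the separation between the two latent variables is hard-wired here, whereas the abstract describes it as being encouraged by an inductive bias during training.

```python
import random

# Hypothetical sizes, chosen only for illustration.
CONTEXT_DIM, LATENT_DIM = 3, 2
LOWER_DIM, UPPER_DIM = 4, 6


def make_linear(rows, cols, rng):
    """Random weight matrix standing in for a trained decoder."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def apply_linear(weights, vec):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def sample_latent(rng, dim=LATENT_DIM):
    """Draw z from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


class ToyTwoPathDecoder:
    """Toy decoder with one latent per body part (an assumption)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Lower-body output sees only the context and z_lower.
        self.w_lower = make_linear(LOWER_DIM, CONTEXT_DIM + LATENT_DIM, rng)
        # Upper-body output sees the context and both latents.
        self.w_upper = make_linear(UPPER_DIM, CONTEXT_DIM + 2 * LATENT_DIM, rng)

    def decode(self, context, z_lower, z_upper):
        lower = apply_linear(self.w_lower, context + z_lower)
        upper = apply_linear(self.w_upper, context + z_lower + z_upper)
        return lower, upper


def predict_with_fixed_lower(decoder, context, num_samples, seed=1):
    """Hold z_lower fixed and resample z_upper for each prediction."""
    rng = random.Random(seed)
    z_lower = sample_latent(rng)  # drawn once: strict control of the lower body
    return [
        decoder.decode(context, z_lower, sample_latent(rng))
        for _ in range(num_samples)
    ]


decoder = ToyTwoPathDecoder(seed=0)
predictions = predict_with_fixed_lower(decoder, [0.5, -0.2, 0.1], num_samples=3)
```

In this sketch every element of `predictions` shares the same lower-body output, while the upper-body output varies with each resampled `z_upper`. Feeding `z_lower` into the upper-body map is one simple way to let the uncontrolled portion adapt to the controlled one; the actual mechanism in the proposed framework may differ.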