Stochastic human motion prediction (HMP) has generally been tackled with generative adversarial networks and variational autoencoders. Most prior works aim to predict highly diverse movements in terms of the dispersion of the skeleton joints. This has led to methods that predict fast, divergent movements, which are often unrealistic and incoherent with past motion. Such methods also neglect contexts that require anticipating diverse low-range behaviors, or actions, with subtle joint displacements. To address these issues, we present BeLFusion, a model that, for the first time, leverages latent diffusion models in HMP to sample from a latent space where behavior is disentangled from pose and motion. As a result, diversity is encouraged from a behavioral perspective. Thanks to our behavior coupler's ability to transfer sampled behavior to ongoing motion, BeLFusion's predictions display a variety of behaviors that are significantly more realistic than those of the state of the art. To support this claim, we introduce two metrics, the Area of the Cumulative Motion Distribution and the Average Pairwise Distance Error, which correlate with our definition of realism according to a qualitative study with 126 participants. Finally, we demonstrate BeLFusion's generalization power in a new cross-dataset scenario for stochastic HMP.