Diffusion-based generative models have recently emerged as powerful solutions for high-quality synthesis in multiple domains. Leveraging bidirectional Markov chains, diffusion probabilistic models generate samples by inferring the reverse Markov chain from the distribution mapping learned during the forward diffusion process. In this work, we propose Modiff, a conditional paradigm that leverages the denoising diffusion probabilistic model (DDPM) to tackle the problem of realistic and diverse action-conditioned 3D skeleton-based motion generation. This is a pioneering attempt to use DDPM to synthesize a variable number of motion sequences conditioned on a categorical action. We evaluate our approach on the large-scale NTU RGB+D dataset and show improvements over state-of-the-art motion generation methods.
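To make the forward/reverse chain concrete, the sketch below shows a standard DDPM forward noising step and ancestral reverse sampling with a class-conditional noise predictor; the `eps_model` denoiser, the linear noise schedule, and the motion tensor shape are illustrative assumptions, not the Modiff architecture itself.

```python
import numpy as np

# Minimal sketch of DDPM forward/reverse Markov chains with action conditioning.
# The denoiser `eps_model` is a hypothetical placeholder for a learned network.

T = 1000                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)         # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, noise):
    """Forward process: q(x_t | x_0) = N(sqrt(ab_t) * x_0, (1 - ab_t) * I)."""
    ab = alpha_bars[t]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def p_sample_step(eps_model, x_t, t, action_label):
    """One reverse step: predict the noise, then sample x_{t-1}."""
    eps_hat = eps_model(x_t, t, action_label)          # conditional noise prediction
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * np.random.randn(*x_t.shape)

def sample_motion(eps_model, shape, action_label):
    """Run the full reverse chain from Gaussian noise to a motion sequence."""
    x = np.random.randn(*shape)
    for t in reversed(range(T)):
        x = p_sample_step(eps_model, x, t, action_label)
    return x

# Example usage with a dummy denoiser; shape = (frames, joints, xyz coordinates).
dummy_eps = lambda x, t, a: np.zeros_like(x)
motion = sample_motion(dummy_eps, shape=(60, 25, 3), action_label=7)
```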