The emergence of neural networks has revolutionized the field of motion synthesis. Yet learning to unconditionally synthesize motions from a given distribution remains challenging, especially when the motions are highly diverse. In this work, we present MoDi, a generative model trained in an unsupervised setting on an extremely diverse, unstructured, and unlabeled dataset. At inference time, MoDi can synthesize high-quality, diverse motions. Despite the lack of any structure in the dataset, our model yields a well-behaved and highly structured latent space that can be semantically clustered. This latent space constitutes a strong motion prior that facilitates various applications, including semantic editing and crowd simulation. In addition, we present an encoder that inverts real motions into MoDi's natural motion manifold, offering solutions to various ill-posed challenges such as completion from a prefix and spatial editing. Our qualitative and quantitative experiments show that MoDi outperforms recent state-of-the-art techniques. Code and trained models are available at https://sigal-raab.github.io/MoDi.