Synthesizing realistic animations of humans, animals, and even imaginary creatures has long been a goal for artists and computer graphics professionals. Unlike the imaging domain, which is rich with large available datasets, the motion domain has a limited number of data instances, particularly for the animation of animals and exotic creatures (e.g., dragons), which have unique skeletons and motion patterns. In this work, we present a Single Motion Diffusion Model, dubbed SinMDM, a model designed to learn the internal motifs of a single motion sequence with arbitrary topology and synthesize motions of arbitrary length that are faithful to them. We harness the power of diffusion models and present a denoising network designed specifically for the task of learning from a single input motion. Our transformer-based architecture avoids overfitting by using local attention layers that narrow the receptive field, and encourages motion diversity by using relative positional embedding. SinMDM can be applied in a variety of contexts, including spatial and temporal in-betweening, motion expansion, style transfer, and crowd animation. Our results show that SinMDM outperforms existing methods in both quality and time-space efficiency. Moreover, while current approaches require additional training for different applications, our work facilitates these applications at inference time. Our code and trained models are available at https://sinmdm.github.io/SinMDM-page.
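To illustrate how local attention narrows the receptive field, the following is a minimal NumPy sketch of windowed self-attention over the frames of a motion sequence. It is not the SinMDM implementation: the function names, the `window` parameter, and the use of identity query/key/value projections are assumptions made purely for illustration.

```python
import numpy as np


def local_attention_mask(num_frames: int, window: int) -> np.ndarray:
    """Boolean mask where entry (i, j) is True if frame i may attend to frame j.

    `window` is a hypothetical parameter: each frame sees at most
    window // 2 neighbours on either side of itself.
    """
    idx = np.arange(num_frames)
    return np.abs(idx[:, None] - idx[None, :]) <= window // 2


def local_self_attention(x: np.ndarray, window: int) -> np.ndarray:
    """Single-head self-attention restricted to a temporal window.

    x has shape (num_frames, dim). Queries, keys, and values are all x itself
    (identity projections), an assumption made to keep the sketch short.
    """
    num_frames, dim = x.shape
    scores = x @ x.T / np.sqrt(dim)
    mask = local_attention_mask(num_frames, window)
    # Frames outside the window get -inf, so their softmax weight is exactly 0.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x
```

With this mask, the output for a given frame depends only on nearby frames, so a single layer cannot memorize the global layout of the one training sequence; it can only model short local motifs.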