Dance is an important human art form, but creating new dances can be difficult and time-consuming. In this work, we introduce Editable Dance GEneration (EDGE), a state-of-the-art method for editable dance generation that is capable of creating realistic, physically plausible dances while remaining faithful to the input music. EDGE uses a transformer-based diffusion model paired with Jukebox, a strong music feature extractor, and confers powerful editing capabilities well-suited to dance, including joint-wise conditioning and in-betweening. We introduce a new metric for physical plausibility and extensively evaluate the quality of dances generated by our method through (1) multiple quantitative metrics on physical plausibility, beat alignment, and diversity benchmarks, and, more importantly, (2) a large-scale user study, demonstrating a significant improvement over previous state-of-the-art methods. Qualitative samples from our model can be found on our website.