We present CycleDance, a dance style transfer system that transforms an existing motion clip in one dance style into a motion clip in another dance style while preserving the motion context of the dance. Our method extends an existing CycleGAN architecture for modeling audio sequences and integrates multimodal transformer encoders to account for the music context. We adopt sequence length-based curriculum learning to stabilize training. Our approach captures rich and long-term intra-relations between motion frames, addressing a common challenge in motion transfer and synthesis work. We further introduce new metrics for gauging transfer strength and content preservation in the context of dance movements. We perform an extensive ablation study as well as a human study with 30 participants, each with five or more years of dance experience. The results demonstrate that CycleDance generates realistic movements with the target style, significantly outperforming the baseline CycleGAN on naturalness, transfer strength, and content preservation.
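For readers unfamiliar with the base architecture: CycleGAN (Zhu et al., 2017) trains a pair of generators G and F between two domains with adversarial losses plus a cycle-consistency term. The equation below restates that standard term for reference; it is the generic CycleGAN objective, not the paper's exact extended loss.

```latex
\mathcal{L}_{\text{cyc}}(G, F) =
\mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
+ \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```

Intuitively, mapping a clip to the other style and back should recover the original clip, which is what lets the transfer preserve motion content without paired training data.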
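As a rough illustration of sequence length-based curriculum learning, the Python sketch below grows the permitted training clip length over epochs and crops sampled clips to the current length. The function names, schedule constants, and `motion_clips` input are hypothetical illustrations, not taken from the paper.

```python
# Minimal sketch of sequence length-based curriculum learning, assuming a
# generic training loop. All names and constants here are hypothetical.
import random

def curriculum_length(epoch: int, start_len: int = 16,
                      max_len: int = 128, step: int = 16) -> int:
    """Grow the permitted sequence length as training progresses."""
    return min(start_len + epoch * step, max_len)

def make_batch(motion_clips, epoch: int, batch_size: int = 8):
    """Sample clips and crop each to the current curriculum length."""
    cur_len = curriculum_length(epoch)
    batch = []
    for clip in random.sample(motion_clips, k=min(batch_size, len(motion_clips))):
        if len(clip) > cur_len:
            # Take a random window so the model sees varied sub-sequences.
            start = random.randrange(len(clip) - cur_len + 1)
            clip = clip[start:start + cur_len]
        batch.append(clip)
    return batch, cur_len
```

Starting from short windows and lengthening them gradually is a common way to stabilize adversarial sequence models, since short sequences give the discriminator and generators an easier problem early in training.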