We introduce the task of human motion unlearning: preventing the synthesis of toxic animations while preserving general text-to-motion generative performance. Unlearning toxic motions is challenging because they can be generated both from explicit text prompts and from implicit toxic combinations of safe motions (e.g., "kicking" is "loading and swinging a leg"). We propose the first motion unlearning benchmark by filtering toxic motions from two large, recent text-to-motion datasets, HumanML3D and Motion-X. We propose baselines by adapting state-of-the-art image unlearning techniques to process spatio-temporal signals. Finally, we propose a novel motion unlearning model based on Latent Code Replacement, which we dub LCR. LCR is training-free and suited to the discrete latent spaces of state-of-the-art text-to-motion diffusion models. LCR is simple and consistently outperforms the baselines both qualitatively and quantitatively. Project page: https://www.pinlab.org/hmu.
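To make the idea of replacing latent codes in a discrete latent space concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a motion is tokenized into a sequence of codebook indices and that a mapping from "toxic" codes to "safe" substitutes is already available; how such codes are identified and which substitutes are chosen is not described in the abstract. The function names, the codebook size, and the mapping are hypothetical.

```python
import numpy as np


def build_replacement_table(codebook_size, toxic_to_safe):
    """Return a lookup table mapping every code index to itself,
    except the (assumed) toxic codes, which map to safe substitutes."""
    table = np.arange(codebook_size)
    for toxic, safe in toxic_to_safe.items():
        table[toxic] = safe
    return table


def replace_codes(token_ids, table):
    """Remap a sequence of discrete motion tokens through the table.
    No model weights are updated, so the step is training-free."""
    return table[np.asarray(token_ids)]


# Hypothetical example: a codebook of 8 codes where codes 3 and 6
# are assumed to have been flagged as toxic.
table = build_replacement_table(8, {3: 1, 6: 0})
tokens = np.array([0, 3, 3, 5, 6, 2])
cleaned = replace_codes(tokens, table)
```

In this sketch the flagged indices are swapped at lookup time and every other token passes through unchanged, which illustrates why an approach of this kind needs no retraining.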