We introduce a deep generative model for image sequences that reliably factorises the latent space into content and motion variables. To model diverse dynamics, we split the motion space into subspaces and introduce a unique Hamiltonian operator for each subspace. The Hamiltonian formulation provides reversible dynamics that constrain the evolution of the motion path to a low-dimensional manifold and conserve learnt invariant properties. The explicit split of the motion space decomposes the Hamiltonian into symmetry groups and yields long-term separability of the dynamics. This split also lets us learn content representations that are easy to interpret and control. We demonstrate the utility of our model by swapping the motion of two videos, generating long-term sequences of various actions from a given image, performing unconditional sequence generation, and rotating images.
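To make the reversibility and conservation claims concrete, here is a minimal sketch (not the authors' implementation) of Hamiltonian dynamics in a 2-D latent motion subspace, integrated with the leapfrog scheme. The quadratic Hamiltonian and the matrix `M` are hypothetical stand-ins for a learnt Hamiltonian operator; the point is that negating the momentum and re-integrating retraces the path, and the energy (the invariant) stays nearly constant.

```python
import numpy as np

# Hypothetical learnt operator for one motion subspace (stand-in, not from the paper).
M = np.array([[2.0, 0.3],
              [0.3, 1.0]])

def grad_potential(q):
    # dH/dq for the quadratic potential 0.5 * q^T M q
    return M @ q

def leapfrog(q, p, dt, steps):
    # Symplectic, time-reversible integrator for H(q, p) = 0.5 p.p + 0.5 q^T M q
    for _ in range(steps):
        p = p - 0.5 * dt * grad_potential(q)
        q = q + dt * p
        p = p - 0.5 * dt * grad_potential(q)
    return q, p

def energy(q, p):
    return 0.5 * p @ p + 0.5 * q @ M @ q

q0, p0 = np.array([1.0, -0.5]), np.array([0.2, 0.4])
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=1000)

# Reversibility: flip the momentum and integrate the same number of steps back.
q2, p2 = leapfrog(q1, -p1, dt=0.01, steps=1000)
assert np.allclose(q2, q0) and np.allclose(-p2, p0)

# Conservation: the Hamiltonian is preserved up to O(dt^2) integration error.
assert abs(energy(q1, p1) - energy(q0, p0)) < 1e-3
```

In the model, such an operator would act on the motion latents of one subspace, with a separate operator per subspace, so each symmetry group evolves independently over long horizons.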