In-betweening is a technique for generating transitions given initial and target character states. Most existing methods require multiple (often $>$10) frames as input, which are not always available. Our work addresses a focused yet challenging problem: generating the transition when given exactly two frames (only the first and the last). To cope with this scenario, we implement a bi-directional scheme that generates forward and backward transitions from the start and end frames with two adversarial autoregressive networks, and stitches them in the middle of the transition, where there is no strict ground truth. The autoregressive networks, based on conditional variational autoencoders (CVAE), are optimized by searching for a pair of optimal latent codes that minimize a novel stitching loss between their outputs. Results show that our method achieves higher motion quality and more diverse results than existing methods on both the LaFAN1 and Human3.6M datasets.
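The latent-code search can be illustrated with a toy sketch. It is not the method's implementation: the two "decoders" are hypothetical linear maps standing in for the trained CVAE-based autoregressive networks, the stitching loss is assumed to be the squared gap between the two midpoint poses, and every name (`W_FWD`, `W_BWD`, `rollout`, `search_latents`, the sizes) is invented for illustration. The sketch only shows the overall pattern: roll out forward from the start frame and backward from the end frame, then adjust the pair of latent codes by gradient descent until the two halves meet.

```python
import random

POSE_DIM, LATENT_DIM, HALF_LEN = 3, 2, 5  # toy sizes, chosen arbitrarily

random.seed(0)


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical stand-ins for the two trained decoders (assumption: linear maps
# from a latent code to a per-frame pose offset; the real networks are CVAEs).
W_FWD = random_matrix(POSE_DIM, LATENT_DIM)
W_BWD = random_matrix(POSE_DIM, LATENT_DIM)


def rollout(first_pose, W, z, steps=HALF_LEN):
    """Autoregressive rollout: each pose is computed from the previous one."""
    delta = [sum(w * zj for w, zj in zip(row, z)) for row in W]
    poses = [list(first_pose)]
    for _ in range(steps):
        poses.append([p + d for p, d in zip(poses[-1], delta)])
    return poses


def stitching_gap(start, end, z_f, z_b):
    """Difference between the forward and backward poses at the midpoint."""
    fwd = rollout(start, W_FWD, z_f)
    bwd = rollout(end, W_BWD, z_b)
    return [f - b for f, b in zip(fwd[-1], bwd[-1])]


def stitching_loss(start, end, z_f, z_b):
    """Assumed stitching loss: squared midpoint gap."""
    return sum(g * g for g in stitching_gap(start, end, z_f, z_b))


def grad_wrt_latent(W, gap):
    """Computes 2 * HALF_LEN * W^T gap, the loss gradient w.r.t. z_f."""
    return [2.0 * HALF_LEN * sum(W[i][j] * gap[i] for i in range(POSE_DIM))
            for j in range(LATENT_DIM)]


def search_latents(start, end, iters=500):
    """Gradient descent over the pair of latent codes (z_f, z_b)."""
    z_f = [0.0] * LATENT_DIM
    z_b = [0.0] * LATENT_DIM
    # Conservative step size from a Frobenius-norm bound on the curvature.
    frob_sq = sum(w * w for W in (W_FWD, W_BWD) for row in W for w in row)
    lr = 1.0 / (2.0 * HALF_LEN ** 2 * frob_sq)
    for _ in range(iters):
        gap = stitching_gap(start, end, z_f, z_b)
        g_f = grad_wrt_latent(W_FWD, gap)
        g_b = grad_wrt_latent(W_BWD, gap)
        z_f = [z - lr * g for z, g in zip(z_f, g_f)]
        # The gradient w.r.t. z_b has the opposite sign, hence the plus.
        z_b = [z + lr * g for z, g in zip(z_b, g_b)]
    return z_f, z_b


start = [random.gauss(0.0, 1.0) for _ in range(POSE_DIM)]
end = [random.gauss(0.0, 1.0) for _ in range(POSE_DIM)]

zeros = [0.0] * LATENT_DIM
initial_loss = stitching_loss(start, end, zeros, zeros)
z_f, z_b = search_latents(start, end)
final_loss = stitching_loss(start, end, z_f, z_b)

# Stitch: forward half (frames 0..k), then the backward half reversed,
# dropping its midpoint pose so the shared frame is not duplicated.
fwd = rollout(start, W_FWD, z_f)
bwd = rollout(end, W_BWD, z_b)
transition = fwd + bwd[-2::-1]
```

In this toy setting the loss is a convex quadratic in the latent codes, so plain gradient descent suffices. With nonlinear networks the search would more plausibly rely on automatic differentiation, and a realistic stitching loss might also penalize velocity discontinuities at the seam; both of those points are assumptions rather than details stated in the abstract.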