We propose a novel framework for producing cartoon videos that fetches color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate dense cross-domain correspondence between the sketch and the cartoon video frames, and to employ a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. The input frames and the synthetic frame, together with the established correspondences, are then fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a temporal-consistency module is applied to the resulting sequence. Compared to common frame interpolation methods, our approach can handle frames with relatively large motion and gives users the flexibility to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, we achieve higher-quality results than other image synthesis methods. Our results show that our system generalizes well to frames from different movies, achieving better results than existing solutions.
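As a rough structural illustration of the pipeline described above (not the authors' implementation), the following Python sketch shows the overall data flow. All module names and signatures here (estimate_correspondence, blend_with_occlusion, interpolate_at_time, enforce_temporal_consistency) are hypothetical placeholders introduced only to make the stages explicit.

```python
# Hypothetical sketch of the described pipeline; every callable passed in is an
# assumed placeholder for one of the paper's stages, not the authors' code.
from typing import Callable, List

import numpy as np

Frame = np.ndarray   # H x W x 3 color keyframe
Sketch = np.ndarray  # H x W sketch image guiding the middle frame


def sketch_guided_inbetweening(
    frame0: Frame,
    frame1: Frame,
    sketch_mid: Sketch,
    estimate_correspondence: Callable,       # dense cross-domain matching (assumed)
    blend_with_occlusion: Callable,          # color blending + occlusion estimation (assumed)
    interpolate_at_time: Callable,           # arbitrary-time frame interpolation (assumed)
    enforce_temporal_consistency: Callable,  # temporal-consistency refinement (assumed)
    times: List[float],
) -> List[Frame]:
    """Generate inbetween frames from two keyframes and a middle-frame sketch."""
    # 1. Dense cross-domain correspondence between each keyframe and the sketch.
    corr0 = estimate_correspondence(frame0, sketch_mid)
    corr1 = estimate_correspondence(frame1, sketch_mid)

    # 2. Synthesize the sketch-guided middle frame by blending keyframe colors,
    #    with occlusion estimation deciding which source pixels to trust.
    frame_mid = blend_with_occlusion(frame0, frame1, corr0, corr1)

    # 3. Feed the keyframes and the synthetic middle frame, with their
    #    correspondences, into arbitrary-time interpolation for extra inbetweens.
    inbetweens = [
        interpolate_at_time(frame0, frame_mid, frame1, corr0, corr1, t)
        for t in times
    ]

    # 4. Refine the whole sequence with the temporal-consistency module.
    return enforce_temporal_consistency([frame0, *inbetweens, frame1])
```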