We present a novel, simple yet effective algorithm for motion-based video frame interpolation. Existing motion-based interpolation methods typically rely on a pre-trained optical flow model or a U-Net-based pyramid network for motion estimation, and thus either suffer from large model size or limited capacity in handling complex and large motions. In this work, by carefully integrating intermediate-oriented forward-warping, a lightweight feature encoder, and a correlation volume into a pyramid recurrent framework, we derive a compact model that simultaneously estimates the bidirectional motion between the input frames. It is 15 times smaller than PWC-Net, yet handles challenging motion cases more reliably and flexibly. Based on the estimated bidirectional motion, we forward-warp the input frames and their context features to the intermediate frame, and employ a synthesis network to estimate the intermediate frame from the warped representations. Our method achieves excellent performance on a broad range of video frame interpolation benchmarks. Code will be available soon.
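To illustrate the intermediate-oriented forward-warping step described above, the following is a minimal sketch: each source pixel is splatted along its flow scaled by the intermediate time `t`, with bilinear weights accumulated and normalized. The function name `forward_warp` and the plain bilinear splatting scheme are assumptions for illustration; the paper's actual warping operator may differ (e.g., a learned or softmax-weighted splatting).

```python
import numpy as np

def forward_warp(frame, flow, t=0.5):
    """Splat `frame` (H, W, C) toward time t along `flow` (H, W, 2).

    Hypothetical helper: each pixel is moved by t * flow and distributed
    to its four nearest target pixels with bilinear weights, then the
    accumulated values are normalized by the accumulated weights.
    """
    H, W, C = frame.shape
    out = np.zeros((H, W, C))
    weight = np.zeros((H, W, 1))
    for y in range(H):
        for x in range(W):
            # Fractional target position at intermediate time t.
            tx = x + t * flow[y, x, 0]
            ty = y + t * flow[y, x, 1]
            x0, y0 = int(np.floor(tx)), int(np.floor(ty))
            for dy in (0, 1):
                for dx in (0, 1):
                    xi, yi = x0 + dx, y0 + dy
                    if 0 <= xi < W and 0 <= yi < H:
                        w = (1 - abs(tx - xi)) * (1 - abs(ty - yi))
                        out[yi, xi] += w * frame[y, x]
                        weight[yi, xi] += w
    # Avoid division by zero in holes left by the forward warp.
    return out / np.maximum(weight, 1e-8)
```

Forward-warping targets the intermediate frame directly, so flows scaled by `t` point to time `t` rather than requiring flows defined at the (unknown) intermediate frame, as backward warping would.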