Recently, flow-based frame interpolation methods have achieved great success by first modeling the optical flow between target and input frames, and then building a synthesis network to generate the target frame. However, this cascaded architecture leads to large model sizes and long inference delays, hindering deployment in mobile and real-time applications. To solve this problem, we propose a novel Progressive Motion Context Refine Network (PMCRNet) that predicts motion fields and image context jointly for higher efficiency. Unlike methods that synthesize the target frame directly from deep features, we simplify the frame interpolation task by borrowing existing texture from adjacent input frames, so the decoder at each pyramid level of our PMCRNet only needs to update an easier-to-learn intermediate optical flow, occlusion merge mask, and image residual. Moreover, we introduce a new annealed multi-scale reconstruction loss to better guide the learning process of this efficient PMCRNet. Experiments on multiple benchmarks show that the proposed approach not only achieves favorable quantitative and qualitative results but also significantly reduces model size and running time.
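To make the synthesis step concrete, the sketch below shows one common way such a per-level output (intermediate flows, occlusion merge mask, image residual) can be turned into a target frame by warping and blending the two input frames. This is a minimal illustration of the general technique described in the abstract, not the authors' implementation; the function names and the use of bilinear backward warping via `grid_sample` are assumptions.

```python
# Minimal sketch (assumed, not the authors' code): assemble the target
# frame from textures borrowed from the two input frames, as implied by
# the abstract. All names here are hypothetical.
import torch
import torch.nn.functional as F


def backward_warp(img: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp img (B, C, H, W) with a backward flow (B, 2, H, W)."""
    b, _, h, w = img.shape
    # Base sampling grid in pixel coordinates.
    gy, gx = torch.meshgrid(
        torch.arange(h, device=img.device, dtype=img.dtype),
        torch.arange(w, device=img.device, dtype=img.dtype),
        indexing="ij",
    )
    x = gx.unsqueeze(0) + flow[:, 0]  # horizontal sample positions
    y = gy.unsqueeze(0) + flow[:, 1]  # vertical sample positions
    # Normalize to [-1, 1], the coordinate range grid_sample expects.
    x = 2.0 * x / max(w - 1, 1) - 1.0
    y = 2.0 * y / max(h - 1, 1) - 1.0
    grid = torch.stack((x, y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(img, grid, mode="bilinear", align_corners=True)


def synthesize_target(i0, i1, flow_t0, flow_t1, mask, residual):
    """Blend warped input textures and add a learned residual.

    mask in [0, 1] is the occlusion merge mask weighting the two warped
    frames; residual corrects what warping alone cannot explain.
    """
    w0 = backward_warp(i0, flow_t0)  # texture borrowed from frame 0
    w1 = backward_warp(i1, flow_t1)  # texture borrowed from frame 1
    return mask * w0 + (1.0 - mask) * w1 + residual
```

Because the decoder only refines these three easy-to-learn quantities at each pyramid level, the heavy lifting of texture generation is avoided, which is the source of the efficiency gain the abstract claims.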