We propose a novel framework for video inpainting that adopts an internal learning strategy. Unlike previous methods that rely on optical flow to propagate context across frames into unknown regions, we show that this propagation can be achieved implicitly by fitting a convolutional neural network to the known regions. Moreover, to handle challenging sequences with ambiguous backgrounds or long-term occlusion, we design two regularization terms that preserve high-frequency details and long-term temporal consistency. Extensive experiments on the DAVIS dataset demonstrate that the proposed method achieves state-of-the-art inpainting quality, both quantitatively and qualitatively. We further extend the proposed method to another challenging task: learning to remove an object from a 4K video given a single object mask in only one frame.
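The idea of fitting a network only to the known regions can be illustrated with a masked reconstruction loss. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the function name `masked_reconstruction_loss`, the array layout `(T, H, W, C)`, the mask convention (1 marks a hole pixel), and the choice of squared error are all hypothetical. The two regularization terms mentioned above are not modeled here.

```python
import numpy as np


def masked_reconstruction_loss(pred, target, hole_mask):
    """Mean squared error computed over known (non-hole) pixels only.

    pred, target: float arrays of shape (T, H, W, C), one entry per frame.
    hole_mask:    array of shape (T, H, W); 1 marks a pixel to inpaint,
                  0 marks a known pixel. (Assumed convention.)
    """
    known = 1.0 - hole_mask.astype(np.float64)       # 1 where the pixel is known
    sq_err = (pred - target) ** 2 * known[..., None]  # zero out hole pixels
    n_known = known.sum() * pred.shape[-1]            # known pixels times channels
    if n_known == 0:
        raise ValueError("hole_mask covers every pixel; nothing to fit")
    return float(sq_err.sum() / n_known)
```

Because hole pixels contribute nothing to this loss, whatever the network produces inside the hole is determined only by what it learned from the known pixels of the same video, which is the sense in which the propagation is implicit.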