Most Video Super-Resolution (VSR) methods enhance a reference frame by aligning its neighboring frames and mining complementary information from them. Recently, deformable alignment, which adaptively aligns neighboring frames with the reference frame, has drawn extensive attention in the VSR community for its remarkable performance. However, we experimentally find that deformable alignment methods still struggle with fast motion, because their offsets are predicted by a local, loss-driven scheme and lack explicit motion constraints. Hence, we propose a Matching-based Flow Estimation (MFE) module that conducts global semantic feature matching and estimates optical flow as a coarse offset for each location. We further propose a Flow-guided Deformable Module (FDM) to integrate the optical flow into deformable convolution. The FDM first uses the optical flow to warp the neighboring frames; the warped neighboring frames and the reference frame are then used to predict a set of fine offsets for each coarse offset. Overall, we build an end-to-end deep network, the Flow-guided Deformable Alignment Network (FDAN), which achieves state-of-the-art performance on two benchmark datasets while remaining competitive in computation and memory consumption.
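The first step of the FDM described above is flow-based warping: each neighboring frame is resampled toward the reference frame using the coarse per-pixel offsets from MFE. The sketch below is a minimal NumPy illustration of that warping step for a single-channel frame, using bilinear interpolation; it is an assumption about the warping operator (the function name `warp_with_flow` and the border-clipping policy are ours), not the paper's actual implementation, which operates on deep features inside the network.

```python
import numpy as np

def warp_with_flow(frame, flow):
    """Bilinearly warp a neighboring `frame` (H, W) toward the reference.

    flow[y, x] = (dx, dy) is the coarse offset at reference location (x, y),
    pointing into the neighboring frame (as estimated by feature matching).
    Sample coordinates are clipped at the image border (our assumption).
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Positions in the neighboring frame to sample from.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(sx).astype(int); x1 = np.clip(x0 + 1, 0, w - 1)
    y0 = np.floor(sy).astype(int); y1 = np.clip(y0 + 1, 0, h - 1)
    wx, wy = sx - x0, sy - y0
    # Bilinear interpolation over the four nearest pixels.
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

In the full FDM, the warped frames are concatenated with the reference frame to predict fine residual offsets around each coarse offset, which then drive a deformable convolution.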