The recurrent structure is a prevalent framework for video super-resolution; it models the temporal dependency between frames via hidden states. When applied to real-world scenarios with unknown and complex degradations, the hidden states tend to contain unpleasant artifacts and propagate them to the restored frames. Our analyses show that, in this setting, such artifacts can be largely alleviated when the hidden state is replaced with a cleaner counterpart. Based on this observation, we propose a Hidden State Attention (HSA) module to mitigate artifacts in real-world video super-resolution. Specifically, we first adopt various cheap filters to produce a hidden state pool: for example, Gaussian blur filters smooth artifacts, while sharpening filters enhance details. To aggregate from this pool a new hidden state that contains fewer artifacts, we devise a Selective Cross Attention (SCA) module, which computes the attention between the input features and each hidden state in the pool. Equipped with HSA, our proposed method, FastRealVSR, achieves a 2x speedup while obtaining better performance than Real-BasicVSR. Code will be available at https://github.com/TencentARC/FastRealVSR
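The two steps described above (building a hidden state pool with cheap filters, then aggregating it with attention against the input features) can be illustrated with a minimal NumPy sketch. This is not the released FastRealVSR code. The function names, the 3x3 kernel size, the inclusion of the unfiltered hidden state in the pool, and the parameter-free per-pixel dot-product attention are all assumptions made for illustration; the actual SCA module is a learned component whose details are not given here.

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    """Normalized 2D Gaussian kernel (sums to 1)."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def sharpen_kernel(size=3, sigma=1.0, amount=1.0):
    """Unsharp-mask style kernel: identity + amount * (identity - blur)."""
    identity = np.zeros((size, size))
    identity[size // 2, size // 2] = 1.0
    return identity + amount * (identity - gaussian_kernel(size, sigma))

def filter_channels(x, kernel):
    """Apply one 2D kernel to every channel of a (C, H, W) array, edge-padded."""
    k = kernel.shape[0]
    pad = k // 2
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros(x.shape, dtype=np.float64)
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * xp[:, i:i + h, j:j + w]
    return out

def build_hidden_state_pool(hidden):
    """Stack the hidden state with blurred and sharpened copies: (N, C, H, W).

    Assumption: the original hidden state is kept as one pool member.
    """
    kernels = [gaussian_kernel(3, 1.0), sharpen_kernel(3, 1.0, 1.0)]
    return np.stack([hidden] + [filter_channels(hidden, k) for k in kernels])

def selective_attention(features, pool):
    """Per-pixel softmax over pool members, scored by similarity to the features.

    Assumption: plain scaled dot product over channels, no learned projections.
    """
    c = features.shape[0]
    scores = (pool * features[None]).sum(axis=1) / np.sqrt(c)  # (N, H, W)
    scores = scores - scores.max(axis=0, keepdims=True)        # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    new_hidden = (weights[:, None] * pool).sum(axis=0)         # (C, H, W)
    return new_hidden, weights

# Toy usage with random tensors standing in for real features.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 8, 8))
features = rng.standard_normal((4, 8, 8))
pool = build_hidden_state_pool(hidden)
new_hidden, weights = selective_attention(features, pool)
```

In this sketch, each pixel of the new hidden state is a convex combination of the candidates in the pool, so locations where the blurred candidate matches the input features better draw more from it, and likewise for the sharpened candidate.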