We propose a novel neural representation for videos (NeRV), which encodes videos in neural networks. Unlike conventional representations that treat videos as frame sequences, we represent videos as neural networks that take a frame index as input. Given a frame index, NeRV outputs the corresponding RGB image. Video encoding in NeRV is simply fitting a neural network to the video frames, and decoding is a simple feedforward operation. As an image-wise implicit representation, NeRV outputs the whole image at once and is far more efficient than pixel-wise implicit representations, improving encoding speed by 25x to 70x and decoding speed by 38x to 132x while achieving better video quality. With such a representation, we can treat videos as neural networks, which simplifies several video-related tasks. For example, conventional video compression methods are restricted by a long and complex pipeline designed specifically for the task. In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression and achieve performance comparable to traditional frame-based video compression approaches (H.264, HEVC, etc.). Besides compression, we demonstrate the generalization of NeRV to video denoising. The source code and pre-trained model can be found at https://github.com/haochen-rye/NeRV.git.
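To make the image-wise formulation concrete, the following minimal sketch maps a frame index to a full RGB frame in a single forward pass. It is an illustration under stated assumptions, not the released implementation: the class `TinyFrameNet`, the helper `positional_encoding`, the sin/cos frequency schedule, the two-layer MLP, and the random (untrained) weights are all hypothetical stand-ins chosen for brevity.

```python
import math
import random


def positional_encoding(t, num_freqs=4, base=1.25):
    """Map a normalized frame index t in [0, 1] to sin/cos features.

    The frequency schedule (base ** i * pi) is an assumption made for
    this sketch.
    """
    feats = []
    for i in range(num_freqs):
        angle = (base ** i) * math.pi * t
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def _sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class TinyFrameNet:
    """Hypothetical image-wise implicit representation: index -> frame."""

    def __init__(self, height, width, hidden=16, num_freqs=4, seed=0):
        rng = random.Random(seed)
        self.height, self.width = height, width
        self.num_freqs = num_freqs
        in_dim = 2 * num_freqs
        out_dim = height * width * 3
        # Random weights stand in for weights fitted to a real video.
        self.w1 = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden)]
        self.w2 = [[rng.gauss(0.0, 0.5) for _ in range(hidden)]
                   for _ in range(out_dim)]

    def decode(self, frame_idx, num_frames):
        """Return one frame as a nested list of shape [H][W][3] in [0, 1]."""
        t = frame_idx / max(num_frames - 1, 1)
        x = positional_encoding(t, self.num_freqs)
        # Hidden layer with ReLU activation.
        h = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.w1]
        # Output layer emits every pixel of the frame in one pass.
        flat = [_sigmoid(sum(w * v for w, v in zip(row, h)))
                for row in self.w2]
        frame = []
        for y in range(self.height):
            row_pixels = []
            for col in range(self.width):
                start = (y * self.width + col) * 3
                row_pixels.append(flat[start:start + 3])
            frame.append(row_pixels)
        return frame


if __name__ == "__main__":
    net = TinyFrameNet(height=4, width=6)
    clip = [net.decode(i, num_frames=10) for i in range(10)]
    print(len(clip), len(clip[0]), len(clip[0][0]), len(clip[0][0][0]))
```

In this sketch, decoding a clip is one forward pass per frame index. A pixel-wise representation would instead query the network once per pixel coordinate, which is the cost the image-wise design avoids. Encoding would correspond to fitting `w1` and `w2` to the target frames, which is omitted here.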