Over the past decades, much progress has been made in video compression, in both traditional and learning-based video codecs. However, few studies focus on using preprocessing techniques to improve rate-distortion performance. In this paper, we propose a rate-perception optimized preprocessing (RPP) method. We first introduce an adaptive Discrete Cosine Transform (DCT) loss function that saves bitrate while preserving essential high-frequency components. We also incorporate several state-of-the-art techniques from low-level vision into our approach, namely a high-order degradation model, an efficient lightweight network design, and an Image Quality Assessment model. By using these techniques jointly, RPP achieves an average bitrate saving of 16.27% with different video encoders such as AVC, HEVC, and VVC under multiple quality metrics. At deployment, RPP is simple and efficient: it requires no changes to the settings of video encoding, streaming, or decoding, and each input frame makes only a single pass through RPP before being sent to the video encoder. In our subjective visual quality test, 87% of users rated videos with RPP as better than or equal to videos compressed by the codec alone, while the RPP videos save about 12% bitrate on average. Our RPP framework has been integrated into the production environment of our video transcoding services, which serve millions of users every day.
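
To make the idea of a DCT-domain loss concrete, the following is a minimal sketch and not the loss used in RPP, whose exact form is not given in this abstract. It assumes 8x8 blocks, a hypothetical magnitude threshold `keep_threshold` that decides which source coefficients count as "essential", and a hypothetical linear frequency weight. Coefficients that are strong in the source are penalized for deviating from it, while weak coefficients are penalized for their magnitude, more heavily at higher frequencies. All function and parameter names are illustrative.

```python
import math

N = 8  # block size; an assumption, chosen because 8x8 DCT blocks are common


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix D, so that coeffs = D * block * D^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def dct2(block):
    """2D DCT-II of a square block given as a list of rows."""
    d = dct_matrix(len(block))
    d_t = [list(col) for col in zip(*d)]
    return matmul(matmul(d, block), d_t)


def adaptive_dct_loss(source, output, keep_threshold=10.0):
    """Hypothetical frequency-adaptive loss between two blocks.

    Strong source coefficients (and the DC term) are treated as essential:
    the output is penalized for deviating from them. Weak coefficients are
    penalized for their magnitude in the output, scaled by frequency.
    """
    src, out = dct2(source), dct2(output)
    n = len(source)
    loss = 0.0
    for u in range(n):
        for v in range(n):
            essential = (u == 0 and v == 0) or abs(src[u][v]) >= keep_threshold
            if essential:
                loss += abs(out[u][v] - src[u][v])
            else:
                weight = (u + v) / (2 * (n - 1))  # 0 at DC, 1 at (n-1, n-1)
                loss += weight * abs(out[u][v])
    return loss / (n * n)


# Example blocks: a flat block, a flat block with faint checkerboard noise,
# and a strong vertical edge.
flat = [[100.0] * N for _ in range(N)]
noisy = [[100.0 + (-1) ** (i + j) for j in range(N)] for i in range(N)]
edge = [[0.0] * (N // 2) + [200.0] * (N // 2) for _ in range(N)]

loss_flat = adaptive_dct_loss(flat, flat)
loss_noisy = adaptive_dct_loss(flat, noisy)
loss_edge_kept = adaptive_dct_loss(edge, edge)
loss_edge_lost = adaptive_dct_loss(edge, flat)
```

Under these assumptions, adding faint high-frequency noise to a flat block raises the loss (`loss_noisy > loss_flat`), and flattening a strong edge raises it as well (`loss_edge_lost > loss_edge_kept`). This matches the stated goal of spending fewer bits on weak high-frequency content while keeping the components that matter.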