Avoiding Quality Saturation in UGC Compression Using Denoised References

Video-sharing platforms must re-encode large volumes of noisy user-generated content (UGC) to meet streaming demands. However, conventional codecs, which aim to minimize the mean squared error (MSE) between the compressed and input videos, can cause quality saturation (QS) when applied to UGC: beyond a certain point, increasing the bitrate preserves input artifacts without improving visual quality. A direct way to address this problem is to detect QS by repeatedly evaluating a non-reference metric (NRM) on videos compressed with multiple codec parameters, which is inefficient. In this paper, we re-frame UGC compression and QS detection through the lens of noisy source coding theory: rather than using an NRM, we compute the MSE with respect to the denoised UGC, which serves as an alternative reference (D-MSE). Unlike the MSE measured between the UGC input and the compressed UGC, which goes to zero as the bitrate increases, D-MSE saturates at a non-zero value, a phenomenon we term distortion saturation (DS). Since D-MSE can be computed at the block level in the transform domain, we can efficiently detect DS without encoding and decoding with multiple parameters. We propose two methods for DS detection: distortion saturation detection (DSD), which relies on an input-dependent threshold derived from the D-MSE of the input UGC, and rate-distortion saturation detection (RDSD), which estimates the Lagrangian at the saturation point using a low-complexity compression method. Both methods work as a pre-processing step that helps standard-compliant codecs avoid QS in UGC compression. Experiments with AVC show that preventing encoding in the saturation region, i.e., avoiding QPs that result in QS according to our methods, achieves BD-rate savings of 8% to 20% across multiple NRMs, compared to a naïve baseline that encodes at the given input QP while ignoring QS.
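To make the distinction between MSE and D-MSE concrete, the following is a minimal sketch, not the paper's implementation. It assumes an orthonormal 8x8 block DCT, a plain uniform quantizer standing in for the codec, and a clean synthetic frame standing in for the output of a denoiser. The function names (`dct_matrix`, `block_dct`, `mse_curves`, `saturation_step`) and the tolerance-based threshold rule are hypothetical; the abstract only states that DSD uses an input-dependent threshold derived from the D-MSE of the input.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m


def block_dct(frame, n=8):
    """2-D orthonormal DCT of each non-overlapping n x n block."""
    h, w = frame.shape
    assert h % n == 0 and w % n == 0, "frame size must be a multiple of n"
    blocks = frame.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    c = dct_matrix(n)
    return c @ blocks @ c.T  # shape: (h // n, w // n, n, n)


def mse_curves(noisy, denoised, steps, n=8):
    """Return (MSE vs. noisy input, D-MSE vs. denoised reference) per step.

    Both are computed in the transform domain; because the DCT is
    orthonormal, they equal the corresponding pixel-domain values.
    """
    y = block_dct(noisy, n)
    x = block_dct(denoised, n)
    mse, d_mse = [], []
    for q in steps:
        y_hat = q * np.round(y / q)  # uniform quantizer (codec stand-in)
        mse.append(np.mean((y_hat - y) ** 2))
        d_mse.append(np.mean((y_hat - x) ** 2))
    return np.array(mse), np.array(d_mse)


def saturation_step(noisy, denoised, steps, tol=0.05, n=8):
    """Hypothetical DSD-style rule (an assumption, not the paper's rule):
    pick the coarsest step whose D-MSE is within (1 + tol) of the
    D-MSE of the uncompressed input."""
    threshold = (1.0 + tol) * np.mean((noisy - denoised) ** 2)
    _, d_mse = mse_curves(noisy, denoised, steps, n)
    ok = [q for q, d in zip(steps, d_mse) if d <= threshold]
    return max(ok) if ok else None


# Synthetic example: smooth frame plus Gaussian noise; the clean frame
# plays the role of the denoised reference (an idealized assumption).
rng = np.random.default_rng(0)
u, v = np.meshgrid(np.arange(64), np.arange(64))
clean = 128 + 60 * np.sin(u / 9.0) * np.cos(v / 7.0)
noisy = clean + rng.normal(0.0, 5.0, clean.shape)
steps = [0.01, 1, 2, 4, 8, 16, 32, 64]

mse, d_mse = mse_curves(noisy, clean, steps)
for q, m, d in zip(steps, mse, d_mse):
    print(f"step={q:>5}: MSE={m:8.3f}  D-MSE={d:8.3f}")
print("selected step:", saturation_step(noisy, clean, steps))
```

As the step size shrinks, the MSE against the noisy input tends to zero, while the D-MSE approaches the D-MSE of the input itself, which is the saturation behavior described above. Note that D-MSE need not be monotone in the step size, since coarse quantization can itself remove part of the noise.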
