By 2022, we expect video traffic to reach 82% of total internet traffic. The abundance of video-driven applications will likely push this share even higher in the near future, enabled by associated advances in video device capabilities. In response to this ever-growing demand, the Alliance for Open Media (AOM) and the Joint Video Experts Team (JVET) have demonstrated strong and renewed interest in developing new video codecs. In this fast-changing codec landscape, there is thus a genuine need for adaptive methods that can be applied universally across codecs. In this study, we formulate video encoding as a multi-objective optimization process in which video quality (as a function of VMAF and PSNR), bitrate demands, and encoding rate (in encoded frames per second) are jointly optimized, going beyond standard video encoding approaches that focus on rate control targeting specific bandwidths. More specifically, we create a dense video encoding space (offline) and then employ regression to generate forward prediction models for each of these optimization objectives, using only Pareto-optimal points. We demonstrate our adaptive video encoding approach, which leverages forward prediction models that qualify for real-time adaptation, using different codecs (e.g., SVT-AV1 and x265) on a variety of video datasets and resolutions. To motivate our approach and establish the promise of future fast VVC encoders, we also perform a comparative performance evaluation using both subjective and objective metrics, and report bitrate savings for all pairs among the VVC, SVT-AV1, x265, and VP9 codecs.
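The offline step described above (build an encoding space, keep only Pareto-optimal points, then fit forward prediction models) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `EncodingPoint` fields, the use of a single quantization parameter as the encoder setting, the linear model form, and the sample measurements are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EncodingPoint:
    qp: int          # hypothetical encoder setting (quantization parameter)
    quality: float   # e.g. a VMAF- or PSNR-based score; higher is better
    bitrate: float   # kbps; lower is better
    fps: float       # encoded frames per second; higher is better


def dominates(a: EncodingPoint, b: EncodingPoint) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = a.quality >= b.quality and a.bitrate <= b.bitrate and a.fps >= b.fps
    better = a.quality > b.quality or a.bitrate < b.bitrate or a.fps > b.fps
    return no_worse and better


def pareto_front(points: List[EncodingPoint]) -> List[EncodingPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic measurements, invented for this sketch only.
space = [
    EncodingPoint(qp=20, quality=95.0, bitrate=4000.0, fps=30.0),
    EncodingPoint(qp=26, quality=90.0, bitrate=2500.0, fps=35.0),
    EncodingPoint(qp=32, quality=84.0, bitrate=1500.0, fps=40.0),
    EncodingPoint(qp=38, quality=76.0, bitrate=900.0, fps=45.0),
    EncodingPoint(qp=30, quality=83.0, bitrate=2600.0, fps=34.0),  # dominated by qp=26
]

front = pareto_front(space)

# One forward model per objective, fitted on Pareto-optimal points only.
qps = [float(p.qp) for p in front]
quality_model = fit_line(qps, [p.quality for p in front])
bitrate_model = fit_line(qps, [p.bitrate for p in front])
fps_model = fit_line(qps, [p.fps for p in front])


def predict(model: Tuple[float, float], qp: float) -> float:
    """Evaluate a fitted forward model at a candidate encoder setting."""
    intercept, slope = model
    return intercept + slope * qp
```

Once fitted, evaluating a model via `predict(quality_model, 29)` is a single multiply-add, which is the kind of cost that makes such models plausible for real-time adaptation. A real encoding space would span more parameters than one, and a linear fit may be too crude for bitrate, so the model form here should be read as a placeholder.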