Most existing deep learning-based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards, such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266), are designed primarily for the YUV 4:2:0 format, in which the U and V components are subsampled to exploit properties of the human visual system. This paper investigates various DLEC designs that support the YUV 4:2:0 format by comparing their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. Experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for the RGB format and achieves about a 10% average BD-rate improvement over intra-frame coding in HEVC.
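To make the structural implication of YUV 4:2:0 input concrete, the sketch below shows one plausible way an analysis (encoder-side) transform can accept a full-resolution luma plane together with chroma planes subsampled by 2 in each spatial dimension: the two signals pass through separate convolutional branches until their feature maps are spatially aligned and can be fused. This is a minimal illustration only; the class name, layer widths, kernel sizes, and strides are assumptions for the example and do not describe the architecture proposed in the paper.

```python
# Minimal sketch (NOT the paper's proposed architecture) of an analysis
# transform that accepts YUV 4:2:0 input. Layer widths/strides are
# illustrative assumptions.
import torch
import torch.nn as nn

class Yuv420AnalysisSketch(nn.Module):
    def __init__(self, n: int = 128):
        super().__init__()
        # Luma branch: two stride-2 convolutions downsample Y by 4.
        self.luma = nn.Sequential(
            nn.Conv2d(1, n, kernel_size=5, stride=2, padding=2), nn.GELU(),
            nn.Conv2d(n, n, kernel_size=5, stride=2, padding=2), nn.GELU(),
        )
        # Chroma branch: U/V already start at half resolution, so a single
        # stride-2 convolution matches the luma feature resolution.
        self.chroma = nn.Sequential(
            nn.Conv2d(2, n, kernel_size=5, stride=2, padding=2), nn.GELU(),
        )
        # Fuse the spatially aligned luma/chroma features into one latent.
        self.fuse = nn.Conv2d(2 * n, n, kernel_size=3, stride=1, padding=1)

    def forward(self, y: torch.Tensor, uv: torch.Tensor) -> torch.Tensor:
        # y:  (B, 1, H,   W)    full-resolution luma plane
        # uv: (B, 2, H/2, W/2)  4:2:0-subsampled chroma planes
        fy = self.luma(y)
        fuv = self.chroma(uv)
        return self.fuse(torch.cat([fy, fuv], dim=1))

# Example: a 256x256 frame stored in YUV 4:2:0 layout.
y = torch.randn(1, 1, 256, 256)
uv = torch.randn(1, 2, 128, 128)
latent = Yuv420AnalysisSketch()(y, uv)  # -> shape (1, 128, 64, 64)
```

A design like this avoids upsampling the chroma planes back to RGB resolution before encoding, which is one reason naive RGB-oriented extensions handle 4:2:0 data inefficiently; the paper's proposed transform network addresses this problem with its own architecture.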