Versatile Video Coding (VVC), as the latest standard, significantly improves coding efficiency over its predecessor standard, High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In VVC, the quad-tree plus multi-type tree (QTMT) structure of coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization. Instead of the brute-force QTMT search, this paper proposes a deep learning approach to predict the QTMT-based CU partition, drastically accelerating the encoding process of intra-mode VVC. First, we establish a large-scale database containing sufficient CU partition patterns with diverse video content, which can facilitate data-driven VVC complexity reduction. Next, we propose a multi-stage exit CNN (MSE-CNN) model with an early-exit mechanism to determine the CU partition, in accordance with the flexible QTMT structure at multiple stages. Then, we design an adaptive loss function for training the MSE-CNN model, which accounts for both the uncertain number of split modes and the objective of minimizing the RD cost. Finally, a multi-threshold decision scheme is developed to achieve a desirable trade-off between complexity and RD performance. Experimental results demonstrate that our approach reduces the encoding time of VVC by 44.65%-66.88% with a negligible Bj{\o}ntegaard delta bit-rate (BD-BR) increase of 1.322%-3.188%, significantly outperforming other state-of-the-art approaches.
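To make the abstract's idea of a stage-wise, early-exit partition decision with per-stage thresholds concrete, the following is a minimal sketch, not the authors' implementation: the names StageNet, SPLIT_MODES, decide_partition, and the threshold values are illustrative assumptions, and the network bodies are placeholders standing in for the MSE-CNN stages.
\begin{verbatim}
# Hypothetical sketch of a per-stage early-exit split decision with
# per-stage thresholds; all names and values here are assumptions,
# not taken from the paper's released code.
import torch
import torch.nn as nn

SPLIT_MODES = ["no_split", "qt", "bt_h", "bt_v", "tt_h", "tt_v"]  # QTMT candidates

class StageNet(nn.Module):
    """Tiny stand-in for one MSE-CNN stage: CU samples -> mode probabilities."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(8, len(SPLIT_MODES))

    def forward(self, x):
        return torch.softmax(self.head(self.features(x)), dim=-1)

def decide_partition(cu, stages, thresholds):
    """At each stage, exit early if 'no_split' is confident; otherwise keep
    only the split modes whose probability exceeds the stage threshold."""
    kept = []
    for stage, thr in zip(stages, thresholds):
        probs = stage(cu)[0]                  # probabilities over SPLIT_MODES
        if float(probs[0]) > thr:             # early exit: stop splitting here
            break
        survivors = [m for m, p in zip(SPLIT_MODES[1:], probs[1:]) if float(p) > thr]
        kept.append(survivors or [SPLIT_MODES[int(probs.argmax())]])
    return kept

if __name__ == "__main__":
    cu = torch.randn(1, 1, 64, 64)            # luma samples of one 64x64 CU
    stages = [StageNet() for _ in range(3)]   # one tiny classifier per depth
    print(decide_partition(cu, stages, thresholds=[0.9, 0.8, 0.7]))
\end{verbatim}
In such a scheme, raising the thresholds prunes more split modes from the RD search (lower complexity, higher BD-BR), while lowering them keeps more candidates (closer to the brute-force search), which is the complexity-performance trade-off the multi-threshold decision is meant to tune.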