The rise of video streaming applications has increased the demand for Video Quality Assessment (VQA). In 2016, Netflix introduced VMAF, a full-reference VQA metric that correlates strongly with perceptual quality but is time-intensive to compute. This paper proposes a Discrete Cosine Transform (DCT)-energy-based VQA model with texture information fusion (VQ-TIF) for video streaming applications that predicts the VMAF of a reconstructed video relative to the original. VQ-TIF extracts Structural Similarity (SSIM) and spatio-temporal features from the frames of the original and reconstructed videos and fuses them using a Long Short-Term Memory (LSTM)-based model to estimate VMAF. Experimental results show that VQ-TIF estimates VMAF with a Pearson Correlation Coefficient (PCC) of 0.96 and a Mean Absolute Error (MAE) of 2.71, on average, compared to the ground-truth VMAF scores. Additionally, VQ-TIF estimates VMAF 9.14 times faster than the state-of-the-art VMAF implementation, with an 89.44% reduction in energy consumption, assuming an Ultra HD (2160p) display resolution.
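The fusion step described above — feeding per-frame quality features through an LSTM to regress a single VMAF score — can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, feature dimensions, and weights below are hypothetical placeholders, and the per-frame inputs stand in for the SSIM and DCT-energy-based spatio-temporal features the model actually extracts.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMRegressor:
    """Hypothetical sketch: a single LSTM cell unrolled over per-frame
    fused features (e.g. SSIM + spatio-temporal features), followed by
    a linear head that maps the final hidden state to a VMAF-like
    score in [0, 100]."""

    def __init__(self, in_dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Stacked gate weights for input, forget, cell, output gates.
        self.W = rng.standard_normal((4 * hidden, in_dim + hidden)) * 0.1
        self.b = np.zeros(4 * hidden)
        self.w_out = rng.standard_normal(hidden) * 0.1
        self.hidden = hidden

    def forward(self, seq):
        H = self.hidden
        h = np.zeros(H)  # hidden state
        c = np.zeros(H)  # cell state
        for x in seq:  # one fused feature vector per frame
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = z[:H], z[H:2 * H], z[2 * H:3 * H], z[3 * H:]
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        # Squash the regression output into the VMAF range [0, 100].
        return 100.0 * sigmoid(self.w_out @ h)

# Toy usage: 30 frames, 8 fused features per frame (illustrative only;
# real inputs would come from the original/reconstructed frame pairs).
rng = np.random.default_rng(1)
frames = rng.standard_normal((30, 8))
model = LSTMRegressor(in_dim=8, hidden=16)
score = model.forward(frames)
```

In a trained model the weights would be fit against ground-truth VMAF scores; here they are random, so the output is only meaningful as a demonstration of the sequence-to-score mapping.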