Recent works have successfully applied Convolutional Neural Networks (CNNs) to reduce the noticeable distortion introduced by the lossy JPEG/MPEG compression techniques. Most of them operate in the spatial domain. In this work, we propose an MPEG video decoder that works purely in the frequency domain: it reads the quantized DCT coefficients from a low-quality I-frame bitstream and, using a deep learning-based model, predicts the missing coefficients in order to reconstruct the same frames with enhanced quality. In experiments on a video dataset, our best model enhanced frames whose quantized DCT coefficients correspond to a Quality Factor (QF) of 10 to a quality approaching QF 20.
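To make the "missing coefficients" concrete, the sketch below shows what QF 10 quantization does to an 8x8 block of DCT coefficients, using the standard JPEG luminance quantization table and the common IJG quality-factor scaling. The gradient test block and all function names here are illustrative assumptions, not part of the proposed decoder; the point is that at QF 10 most high-frequency coefficients quantize to zero, and those zeros are what a model in the frequency domain would have to predict.

```python
import numpy as np

# Standard JPEG luminance quantization table (ITU-T T.81, Annex K).
BASE_Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def scale_qtable(qf):
    # IJG quality-factor scaling: QF < 50 coarsens the table (QF 10 -> 5x).
    scale = 5000 / qf if qf < 50 else 200 - 2 * qf
    q = np.floor((BASE_Q * scale + 50) / 100)
    return np.clip(q, 1, 255)

def dct_matrix(n=8):
    # Orthonormal DCT-II basis, so the 2-D DCT of X is C @ X @ C.T.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

# Toy example: a smooth gradient block, level-shifted by -128 as in JPEG.
block = 4.0 * np.add.outer(np.arange(8), np.arange(8)) - 128.0
C = dct_matrix()
coeffs = C @ block @ C.T

q10 = scale_qtable(10)
quantized = np.round(coeffs / q10)

# At QF 10 almost all AC coefficients are quantized to zero; a
# frequency-domain model receives `quantized` and must predict the
# coefficients that the coarse table destroyed.
zeros = int((quantized == 0).sum())
```

At QF 50 the scaling formula recovers the base table exactly, which is a handy sanity check when reimplementing it.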