In recent years, video data has come to dominate internet traffic and has become one of the major data formats. With the emergence of 5G and Internet of Things (IoT) technologies, more and more video is generated by edge devices, sent across networks, and consumed by machines. The volume of video consumed by machines is exceeding the volume consumed by humans. Machine vision tasks include object detection, segmentation, tracking, and other machine-based applications, whose requirements differ considerably from those of human viewing. At the same time, because of the large volume of video data, it is essential to compress video before transmission. Efficient video coding for machines (VCM) has therefore become an important topic in academia and industry. In July 2019, the international standardization organization MPEG created an ad hoc group named VCM to study the requirements for potential standardization work. In this paper, we address the recent development activities in the MPEG VCM group. Specifically, we first provide an overview of the MPEG VCM group, including use cases, requirements, processing pipelines, and the plan for potential VCM standards, followed by the evaluation framework, including machine-vision tasks, datasets, evaluation metrics, and anchor generation. We then introduce the technology solutions proposed so far and discuss the recent responses to the Call for Evidence issued by the MPEG VCM group.