This study examines the relationship between H.264 video compression and the performance of an object detection network (YOLOv5). We curated a set of 50 surveillance videos and annotated targets of interest (people, bikes, and vehicles). Videos were encoded at five quality levels using Constant Rate Factor (CRF) values of 22, 32, 37, 42, and 47. YOLOv5 was applied to the compressed videos, and detection performance was analyzed at each CRF level. Test results indicate that detection performance is generally robust to moderate levels of compression: using a CRF value of 37 instead of 22 significantly reduces bitrates and file sizes without adversely affecting detection performance. However, detection performance degrades appreciably at higher compression levels, especially in complex scenes with poor lighting and fast-moving targets. Finally, retraining YOLOv5 on compressed imagery gives up to a 1% improvement in F1 score when applied to highly compressed footage.
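The encoding sweep and the F1 metric described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the `ffmpeg` command-line tool with the `libx264` encoder is available, and the helper names (`ffmpeg_command`, `encode_all`, `f1_score`), the output naming scheme, and the choice to drop audio are hypothetical. Only the CRF values come from the abstract.

```python
import subprocess
from pathlib import Path

# CRF values stated in the abstract (a higher CRF means stronger compression).
CRF_LEVELS = [22, 32, 37, 42, 47]


def ffmpeg_command(src, dst, crf):
    """Build an ffmpeg command that re-encodes src to H.264 at the given CRF.

    Assumption: audio is dropped (-an) because only the video stream matters
    for detection; all other encoder settings are left at ffmpeg defaults.
    """
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(src),          # input video
        "-c:v", "libx264",       # H.264 encoder
        "-crf", str(crf),        # constant rate factor
        "-an",                   # no audio
        str(dst),
    ]


def encode_all(src, out_dir, run=False):
    """Return one command per CRF level; execute them only if run=True."""
    src, out_dir = Path(src), Path(out_dir)
    commands = []
    for crf in CRF_LEVELS:
        # Hypothetical naming scheme: <input stem>_crf<value>.mp4
        dst = out_dir / f"{src.stem}_crf{crf}.mp4"
        commands.append(ffmpeg_command(src, dst, crf))
    if run:
        out_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom
```

Calling `encode_all("clip.mp4", "out")` only builds the five commands, which makes the sweep easy to inspect before running it; passing `run=True` executes them. The detector's per-frame matches at each CRF level would then be tallied into `tp`, `fp`, and `fn` and passed to `f1_score` to compare levels.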