Researchers have presented systems for efficiently analyzing video data at scale using sampling algorithms. While these systems effectively leverage the temporal redundancy present in videos, they suffer from three limitations. First, they use traditional video storage formats that are tailored for human consumption. Second, they load and decode the entire compressed video in memory before applying the sampling algorithm. Third, the sampling algorithms often require labeled training data obtained using a specific deep learning model. These limitations lead to lower accuracy, higher query execution time, and larger memory footprint. In this paper, we present EKO, a storage engine for efficiently managing video data. EKO relies on two optimizations. First, it uses a novel unsupervised, adaptive sampling algorithm to identify the key frames in a given video. Second, it stores the identified key frames in a compressed representation that is optimized for machine consumption. We show that EKO improves F1-score by up to 9% over the next best-performing state-of-the-art unsupervised sampling algorithm by selecting more representative frames. It reduces query execution time by 3X and memory footprint by 10X compared to a widely used, traditional video storage format.
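To make the unsupervised key-frame selection idea concrete, the sketch below shows one generic way such a sampler can work: cluster per-frame features and keep the frame nearest each cluster centroid as a representative key frame. This is only an illustration under assumed inputs, not EKO's actual adaptive algorithm; the `select_key_frames` helper, the raw-pixel features, and the use of k-means are all assumptions for the example.

```python
# A minimal sketch (not EKO's algorithm) of unsupervised key-frame selection:
# cluster per-frame features and keep the frame closest to each cluster
# centroid as a representative key frame.
import numpy as np
from sklearn.cluster import KMeans

def select_key_frames(frames: np.ndarray, num_key_frames: int) -> list[int]:
    """frames: (N, H, W, C) uint8 array of decoded video frames.
    Returns sorted indices of the selected key frames."""
    n = frames.shape[0]
    # Coarse per-frame feature vector: flattened, normalized pixel values.
    feats = frames.reshape(n, -1).astype(np.float32) / 255.0
    km = KMeans(n_clusters=num_key_frames, n_init=10, random_state=0).fit(feats)
    key_frames = []
    for c in range(num_key_frames):
        members = np.where(km.labels_ == c)[0]
        # Pick the member frame nearest to this cluster's centroid.
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        key_frames.append(int(members[np.argmin(dists)]))
    return sorted(key_frames)
```

A real system would replace the raw-pixel features with cheaper learned or difference-based features and choose the number of key frames adaptively per video segment, rather than fixing it up front as this sketch does.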