Performing analytics tasks over large-scale video datasets is increasingly common in a wide range of applications. These tasks generally involve object detection and tracking operations that require applying expensive machine learning models, and several systems have recently been proposed to optimize the execution of video queries and reduce their cost. However, prior work generally optimizes execution speed along only one dimension, focusing on a single optimization technique while ignoring other potential avenues for accelerating execution, and thereby delivers an unsatisfactory tradeoff between speed and accuracy. We propose MultiScope, a general-purpose video pre-processor for object detection and tracking that explores multiple avenues for optimizing video queries, extracting tracks from video with a better speed-accuracy tradeoff than prior work. We compare MultiScope against three recent systems on seven diverse datasets and find that it provides a 2.9x average speedup over the next best baseline at the same accuracy level.