Live video analytics (LVA) runs continuously across massive camera fleets, but inference cost with modern vision models remains high. Dynamic model size selection (DMSS) is an attractive way to address this: it is content-aware yet treats models as black boxes, and could reduce cost by up to 10x without model retraining or modification. We observe that, lacking ground-truth labels at runtime, DMSS methods use two stages per segment: (i) sampling a few models to compute prediction statistics (e.g., confidences), and (ii) selecting the model size from those statistics. Prior systems fail to generalize to diverse workloads, particularly to mobile videos and lower accuracy targets. We trace these failures to two causes: inefficient sampling, whose cost exceeds its benefit, and inaccurate per-segment accuracy prediction. In this work, we present RedunCut, a new DMSS system that addresses both: it uses a measurement-driven planner that estimates the cost-benefit tradeoff of sampling, and a lightweight, data-driven performance model that improves accuracy prediction. Across road-vehicle, drone, and surveillance videos and multiple model families and tasks, RedunCut reduces compute cost by 14-62% at fixed accuracy and remains robust to limited historical data and to drift.
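The generic two-stage DMSS loop described above can be illustrated with a minimal Python sketch. It is not RedunCut's planner or performance model: the `Model` type, the `select_model` helper, the cheapest-first sampling order, the use of mean confidence as the statistic, and the identity mapping from statistic to predicted accuracy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class Model:
    """Hypothetical black-box model: only its cost and confidences are visible."""
    name: str
    cost: float  # relative inference cost per frame (illustrative units)
    confidence: Callable[[object], float]  # frame -> confidence in [0, 1]


def select_model(
    models: Sequence[Model],
    sample_frames: Sequence[object],
    predict_accuracy: Callable[[float], float],
    target: float,
) -> Model:
    """Return the cheapest model whose predicted accuracy meets the target."""
    for model in sorted(models, key=lambda m: m.cost):
        # Stage (i): run the model on a few sampled frames of the segment
        # and summarize its confidences into one statistic.
        stat = mean(model.confidence(frame) for frame in sample_frames)
        # Stage (ii): map the statistic to a predicted accuracy (no ground
        # truth is available at runtime) and compare it with the target.
        if predict_accuracy(stat) >= target:
            return model
    # No model is predicted to meet the target: fall back to the largest one.
    return max(models, key=lambda m: m.cost)


# Toy models with constant confidences (assumed values, not measurements).
small = Model("small", cost=1.0, confidence=lambda frame: 0.6)
large = Model("large", cost=10.0, confidence=lambda frame: 0.9)

chosen = select_model(
    [large, small],
    sample_frames=[0, 1, 2],
    predict_accuracy=lambda stat: stat,  # placeholder accuracy predictor
    target=0.8,
)
print(chosen.name)  # prints "large"
```

Note that the sketch pays the sampling cost of every model it tries before settling on one. That overhead is the kind of cost the abstract says can exceed the benefit of sampling.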