To process visual data efficiently at scale, researchers have proposed two techniques for lowering the computational overhead of the underlying deep learning models. The first approach uses a specialized, lightweight model to answer the query directly. The second approach filters out irrelevant frames with a lightweight model and processes the remaining frames with a heavyweight model. Each technique has a limitation. With the first approach, the specialized model cannot provide accurate results for hard-to-detect events. With the second approach, the system cannot accelerate queries that target frequently occurring events, because the filter cannot eliminate a significant fraction of the frames in the video. In this paper, we present THIA, a video analytics system that tackles these limitations. The design of THIA is centered around three techniques. First, instead of using a cascade of models, it uses a single object detection model with multiple exit points for short-circuiting the inference. This early inference technique allows it to support a range of throughput-accuracy tradeoffs. Second, it adopts a fine-grained approach to planning and processes different chunks of the video using different exit points to meet the user's requirements. Lastly, it uses a lightweight technique for directly estimating the exit point for a chunk, which lowers the optimization time. We empirically show that these techniques enable THIA to outperform two state-of-the-art video analytics systems by up to 6.5×, while providing accurate results even on queries that target hard-to-detect events.
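The chunk-level planning idea can be illustrated with a minimal sketch. This is not THIA's implementation: the names `ExitPoint`, `plan_exit_points`, and `toy_estimator`, the `relative_cost` field, and the "difficulty" score per chunk are all hypothetical, and the accuracy estimator is a stand-in for whatever lightweight estimation technique the system actually uses. The sketch only shows the selection rule of picking, for each chunk, the cheapest exit point whose estimated accuracy meets the user's target.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, TypeVar

Chunk = TypeVar("Chunk")


@dataclass(frozen=True)
class ExitPoint:
    """Hypothetical description of one exit point of a multi-exit detector."""
    name: str
    relative_cost: float  # assumed cost relative to running the full model (1.0)


def plan_exit_points(
    chunks: Sequence[Chunk],
    exits: Sequence[ExitPoint],
    estimate_accuracy: Callable[[Chunk, ExitPoint], float],
    min_accuracy: float,
) -> List[ExitPoint]:
    """Pick, per chunk, the cheapest exit whose estimated accuracy meets the target.

    If no exit meets the target, fall back to the most expensive (deepest) exit.
    """
    if not exits:
        raise ValueError("at least one exit point is required")
    ordered = sorted(exits, key=lambda e: e.relative_cost)
    plan = []
    for chunk in chunks:
        chosen = ordered[-1]  # fallback: deepest exit
        for exit_point in ordered:
            if estimate_accuracy(chunk, exit_point) >= min_accuracy:
                chosen = exit_point
                break
        plan.append(chosen)
    return plan


def toy_estimator(difficulty: float, exit_point: ExitPoint) -> float:
    """Toy stand-in: harder chunks lose more accuracy at shallower exits."""
    return 1.0 - difficulty * (1.0 - exit_point.relative_cost)


if __name__ == "__main__":
    exits = [
        ExitPoint("early", 0.25),
        ExitPoint("middle", 0.5),
        ExitPoint("full", 1.0),
    ]
    # Each chunk is represented only by an assumed difficulty score in [0, 1].
    chunk_difficulties = [0.1, 0.2, 0.5]
    plan = plan_exit_points(chunk_difficulties, exits, toy_estimator, min_accuracy=0.88)
    for difficulty, exit_point in zip(chunk_difficulties, plan):
        print(f"difficulty={difficulty}: use exit '{exit_point.name}'")
```

Under these toy assumptions, easy chunks are served by a shallow exit and hard chunks fall through to the full model, which is the kind of throughput-accuracy tradeoff the fine-grained planning is meant to exploit.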