Transporting video over the network is time-consuming; hence, running analytics on live video directly on embedded or mobile devices has become an important system driver. Considering that such devices, e.g., surveillance cameras or AR/VR gadgets, are resource-constrained, creating lightweight deep neural networks (DNNs) for embedded devices is crucial. None of the current approximation techniques for object classification DNNs can adapt to changing runtime conditions, e.g., changes in resource availability on the device, the content characteristics, or requirements from the user. In this paper, we introduce ApproxNet, a video object classification system for embedded or mobile clients. It enables novel dynamic approximation techniques to achieve desired trade-offs between inference latency and accuracy under changing runtime conditions. It achieves this by enabling two approximation knobs within a single DNN model, rather than creating and maintaining an ensemble of models (e.g., MCDNN [MobiSys-16]). We show that ApproxNet adapts seamlessly at runtime to these changes, provides low and stable latency for image and video-frame classification, and improves on the accuracy and latency of ResNet [CVPR-16], MCDNN [MobiSys-16], MobileNets [Google-17], NestDNN [MobiCom-18], and MSDNet [ICLR-18].
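To make the two-knob idea concrete, below is a minimal sketch, not the authors' implementation, of a single DNN that exposes two approximation knobs: input resolution and early-exit depth. The class name TwoKnobNet, the layer sizes, and the knob settings are illustrative assumptions, written in PyTorch.

```python
# Minimal sketch (assumptions, not ApproxNet's actual architecture) of one DNN
# with two approximation knobs: input resolution and early-exit depth.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoKnobNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Shared backbone split into stages; a classification head after each stage.
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.exits = nn.ModuleList([
            nn.Linear(16, num_classes),   # earliest exit: fastest, least accurate
            nn.Linear(32, num_classes),
            nn.Linear(64, num_classes),   # final exit: slowest, most accurate
        ])
        self.stages = [self.stage1, self.stage2, self.stage3]

    def forward(self, x, resolution=224, exit_idx=2):
        # Knob 1: resample the input frame to the chosen resolution.
        x = F.interpolate(x, size=(resolution, resolution), mode="bilinear",
                          align_corners=False)
        # Knob 2: run only the first (exit_idx + 1) stages, then classify.
        for stage in self.stages[: exit_idx + 1]:
            x = stage(x)
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)
        return self.exits[exit_idx](x)

# A runtime scheduler could pick (resolution, exit_idx) per frame to meet a
# latency budget, e.g., (112, 0) under heavy load, (224, 2) when resources are free.
model = TwoKnobNet()
frame = torch.randn(1, 3, 224, 224)
fast = model(frame, resolution=112, exit_idx=0)       # low latency
accurate = model(frame, resolution=224, exit_idx=2)   # higher accuracy
```

Because both knobs live inside one model, switching configurations at runtime costs only a different forward pass, rather than loading a different model from an ensemble as in MCDNN [MobiSys-16].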