In surveillance and search-and-rescue applications, it is important to perform multi-target tracking (MOT) in real time on low-end devices. Today's MOT solutions employ deep neural networks, which tend to have high computational complexity. Recognizing the effect of frame size on tracking performance, we propose DeepScale, a model-agnostic frame size selection approach that operates on top of existing fully convolutional network-based trackers to increase tracking throughput. In the training stage, we incorporate detectability scores into a one-shot tracker architecture so that DeepScale can learn representation estimations for different frame sizes in a self-supervised manner. During inference, it adapts frame sizes to the complexity of the visual content, subject to user-controlled parameters. To leverage computation resources on edge servers, we propose two computation partition schemes tailored for MOT, namely, edge-server-only tracking with adaptive frame-size transmission and edge-server-assisted tracking. Extensive experiments and benchmark tests on MOT datasets demonstrate the effectiveness and flexibility of DeepScale. Compared to a state-of-the-art tracker, DeepScale++, a variant of DeepScale, achieves a 1.57X speedup with only moderate degradation (~2.3%) in tracking accuracy on the MOT15 dataset in one configuration. We have implemented and evaluated DeepScale++ and the proposed computation partition schemes on a small-scale testbed consisting of an NVIDIA Jetson TX2 board and a GPU server. The experiments reveal non-trivial trade-offs between tracking performance and latency compared to server-only or smart-camera-only solutions.
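As a rough illustration of inference-time frame size adaptation, the following sketch picks the smallest candidate frame size whose predicted detectability score stays within a user-chosen tolerance of the best predicted score. This is not the selection rule of DeepScale itself: the function name `select_frame_size`, the relative `tolerance` parameter, the candidate sizes, and the score values are all hypothetical, chosen only to show how a user-controlled parameter can trade accuracy for throughput.

```python
from typing import Dict


def select_frame_size(predicted_scores: Dict[int, float], tolerance: float) -> int:
    """Pick the smallest frame size that is predicted to be "good enough".

    predicted_scores maps a candidate frame size (e.g. frame width in pixels)
    to a non-negative predicted detectability score (higher is better).
    tolerance is a hypothetical user-controlled parameter in [0, 1]: the
    relative drop from the best score that the user is willing to accept.
    """
    if not predicted_scores:
        raise ValueError("predicted_scores must not be empty")
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be in [0, 1]")

    best = max(predicted_scores.values())
    threshold = best * (1.0 - tolerance)

    # Keep every size whose predicted score clears the threshold,
    # then return the smallest one, since it is the cheapest to process.
    acceptable = [size for size, score in predicted_scores.items()
                  if score >= threshold]
    return min(acceptable)


# Illustrative scores only; these are not results from the paper.
scores = {576: 0.80, 864: 0.90, 1088: 0.92}
strict = select_frame_size(scores, tolerance=0.0)
balanced = select_frame_size(scores, tolerance=0.05)
relaxed = select_frame_size(scores, tolerance=0.2)
```

With a tolerance of zero the largest, best-scoring size is kept; loosening the tolerance lets the selector fall back to smaller frames when the predicted loss in detectability is small, which is the kind of accuracy-versus-throughput control the abstract describes.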