Deep Neural Network (DNN) based visual tracking solutions have developed rapidly in recent years. Some trackers combine DNN-based solutions with Discriminative Correlation Filters (DCF) to extract semantic features and deliver state-of-the-art tracking accuracy. However, these solutions are highly compute-intensive and require long processing times, so real-time performance is not guaranteed. To deliver both high accuracy and reliable real-time performance, we propose a novel tracker called SiamVGG. It combines a Convolutional Neural Network (CNN) backbone with a cross-correlation operator, and takes advantage of the features from exemplar images for more accurate object tracking. The architecture of SiamVGG is customized from VGG-16, with the parameters shared between the exemplar images and the input video frames. We evaluate SiamVGG on the OTB-2013/50/100 and VOT 2015/2016/2017 datasets, where it achieves state-of-the-art accuracy while maintaining real-time performance of 50 FPS on a GTX 1080Ti. Our design achieves a 2% higher Expected Average Overlap (EAO) than ECO and C-COT in the VOT2017 Challenge.
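The core idea of a shared backbone followed by cross-correlation can be illustrated with a minimal pure-Python sketch. This is not the SiamVGG implementation: the function names (`embed`, `cross_correlate`, `locate`), the weights, and the toy images are all hypothetical, and a single 1x1 filter with ReLU stands in for the modified VGG-16 backbone. It only shows how features from an exemplar image are slid over features from a search frame to produce a response map whose peak marks the most likely target position.

```python
# Illustrative sketch only; names and values are assumptions, not from the paper.

def embed(image, weights):
    """Toy stand-in for the shared backbone: one 1x1 filter plus ReLU per channel."""
    return [[[max(0.0, w * px) for px in row] for row in image] for w in weights]


def cross_correlate(exemplar, search):
    """Slide exemplar features [C][h][w] over search features [C][H][W]."""
    channels = len(exemplar)
    eh, ew = len(exemplar[0]), len(exemplar[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    response = []
    for y in range(sh - eh + 1):
        row = []
        for x in range(sw - ew + 1):
            score = 0.0
            for c in range(channels):
                for dy in range(eh):
                    for dx in range(ew):
                        score += exemplar[c][dy][dx] * search[c][y + dy][x + dx]
            row.append(score)
        response.append(row)
    return response


def locate(response):
    """Return the (row, column) of the highest score in the response map."""
    best = max((v, y, x) for y, row in enumerate(response) for x, v in enumerate(row))
    return best[1], best[2]


# The same weights embed both inputs, mirroring the shared-parameter design.
SHARED_WEIGHTS = [1.0, 0.5]

exemplar_image = [[1.0, 1.0],
                  [1.0, 1.0]]

# 5x5 search frame containing the 2x2 target at rows 2-3, columns 3-4.
search_image = [[0.0] * 5 for _ in range(5)]
for r in (2, 3):
    for col in (3, 4):
        search_image[r][col] = 1.0

response = cross_correlate(embed(exemplar_image, SHARED_WEIGHTS),
                           embed(search_image, SHARED_WEIGHTS))
peak = locate(response)
print(peak)
```

In a real tracker the embedding would be a deep convolutional network and the correlation would run as a batched convolution on the GPU. The structure is the same: one set of weights, two inputs, one response map.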