Although single object trackers have achieved strong performance, their large network models make them difficult to deploy on resource-constrained platforms. Moreover, existing lightweight trackers balance only two or three of the following four factors: parameters, performance, FLOPs, and FPS. To balance all four, this paper proposes a lightweight fully convolutional Siamese tracker called LightFC. LightFC employs a novel efficient cross-correlation module (ECM) and a novel efficient rep-center head (ERH) to enhance the nonlinear expressiveness of the convolutional tracking pipeline. The ECM adopts an attention-like architecture: it fuses local spatial and channel features from the pixel-wise correlation fusion features and enhances model nonlinearity with an inversion activation block. Additionally, the ECM introduces skip connections and reuses search area features to improve its performance. The ERH introduces reparameterization and channel attention to enhance the nonlinear expressiveness of the center head. Comprehensive experiments show that LightFC achieves a good balance among performance, parameters, FLOPs, and FPS. The precision score of LightFC exceeds that of MixFormerV2-S by 3.7% and 6.5% on LaSOT and TNL2K, respectively, while using 5x fewer parameters and 4.6x fewer FLOPs. In addition, LightFC runs 2x faster than MixFormerV2-S on CPUs. Our code and raw results are available at https://github.com/LiYunfengLYF/LightFC
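For readers unfamiliar with the pixel-wise correlation that the ECM builds on, the following is a minimal sketch of the operation as it is commonly formulated in the Siamese tracking literature. It is not taken from the LightFC code: the function name `pixelwise_correlation` and the nested-list tensor layout are assumptions made for illustration. Each spatial position of the template feature map is treated as a 1x1 kernel over the channels and correlated with every position of the search feature map, giving one response map per template position.

```python
def pixelwise_correlation(template, search):
    """Illustrative pixel-wise correlation (hypothetical helper, not LightFC's code).

    template: nested lists shaped [C][Hz][Wz] (template features)
    search:   nested lists shaped [C][Hx][Wx] (search-area features)
    returns:  nested lists shaped [Hz * Wz][Hx][Wx]
    """
    channels = len(template)
    hz, wz = len(template[0]), len(template[0][0])
    hx, wx = len(search[0]), len(search[0][0])

    responses = []
    for i in range(hz):
        for j in range(wz):
            # The channel vector at template position (i, j) acts as a 1x1 kernel.
            kernel = [template[c][i][j] for c in range(channels)]
            # Dot product of that kernel with the channel vector at each search position.
            response = [
                [
                    sum(kernel[c] * search[c][y][x] for c in range(channels))
                    for x in range(wx)
                ]
                for y in range(hx)
            ]
            responses.append(response)
    return responses


# Tiny example: 2 channels, a 1x1 template, and a 2x2 search area.
template = [[[1.0]], [[2.0]]]
search = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
result = pixelwise_correlation(template, search)
```

The output has as many channels as the template has spatial positions, so the search area's spatial resolution is preserved; a module such as the ECM can then apply further spatial and channel fusion to these response maps.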