Recent sophisticated CNN-based algorithms have shown a remarkable ability to count crowds automatically from images, thanks to architectures designed to handle the wide variation in head scales. However, these complicated architectures also increase computational cost enormously, making real-time estimation impractical. This paper therefore proposes a new method, based on Inception-V3, to reduce the amount of computation. The proposed approach (ICC) uses the first five inception blocks together with the contextual module designed in CAN to extract features at different receptive fields, which makes it context-aware. Combining these two strategies can also increase the model's robustness. Experiments show that, in the best case, ICC cuts computation by 85.3 percent at the cost of a 24.4 percent loss in performance. This efficiency contributes significantly to the deployment of crowd counting models in surveillance systems that safeguard public safety. The code will be available at https://github.com/YIMINGMA/CrowdCounting-ICC, and its pre-trained weights on the Crowd Counting dataset, which comprises a large variety of scenes from surveillance perspectives, will also be open-sourced.
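To make "features at different receptive fields" concrete, the following is a minimal sketch of multi-scale context fusion in the spirit of the CAN contextual module. It is not the ICC implementation: it runs on a single-channel grid in pure Python, omits the learned convolutions, uses nearest-neighbour rather than bilinear upsampling, and assumes the pooling scales (1, 2, 3, 6). The function names are hypothetical and chosen for illustration only.

```python
import math


def adaptive_avg_pool(grid, k):
    """Average-pool an H x W grid into k x k bins."""
    h, w = len(grid), len(grid[0])
    pooled = []
    for i in range(k):
        # Bin rows span [floor(i*h/k), ceil((i+1)*h/k)).
        r0, r1 = (i * h) // k, -((-(i + 1) * h) // k)
        row = []
        for j in range(k):
            c0, c1 = (j * w) // k, -((-(j + 1) * w) // k)
            cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Stretch a k x k grid back to H x W by nearest-neighbour lookup."""
    k = len(pooled)
    return [[pooled[(r * k) // h][(c * k) // w] for c in range(w)]
            for r in range(h)]


def context_aware_fuse(grid, scales=(1, 2, 3, 6)):
    """Blend context features from several scales with per-pixel weights.

    For each scale, the contrast between the pooled context and the local
    value is passed through a sigmoid to obtain a weight (assumption: no
    learned layers), and the output is the weighted mean of the contexts.
    """
    h, w = len(grid), len(grid[0])
    num = [[0.0] * w for _ in range(h)]
    den = [[0.0] * w for _ in range(h)]
    for k in scales:
        context = upsample_nearest(adaptive_avg_pool(grid, k), h, w)
        for r in range(h):
            for c in range(w):
                contrast = context[r][c] - grid[r][c]
                weight = 1.0 / (1.0 + math.exp(-contrast))
                num[r][c] += weight * context[r][c]
                den[r][c] += weight
    return [[num[r][c] / den[r][c] for c in range(w)] for r in range(h)]


if __name__ == "__main__":
    # Hypothetical 6 x 6 feature map used only to exercise the sketch.
    features = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
    fused = context_aware_fuse(features)
    print(len(fused), len(fused[0]))
```

Small pooling grids summarise wide regions of the image and large grids summarise narrow ones, so each output value mixes context from several receptive field sizes. In a trained network the weights would come from learned layers applied to multi-channel features rather than from a bare sigmoid.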