In recent years, crowd counting has become an important problem in computer vision. In most methods, density maps are generated by convolving ground-truth dot maps, in which each dot marks the approximate center of a human head, with a Gaussian kernel. Because of the fixed geometric structures in CNNs and indistinct head-scale information, head features are captured incompletely. Deformable convolution has been proposed to give CNN features scale-adaptive capability within head regions: by learning coordinate offsets for the sampling points, it can adjust the receptive field. However, the sampling points in deformable convolution do not cover the heads uniformly, which results in a loss of head information. To handle this non-uniform sampling, this paper proposes an improved Normed-Deformable Convolution (\textit{i.e.}, NDConv), implemented through a Normed-Deformable loss (\textit{i.e.}, NDloss). The offsets of the sampling points, constrained by NDloss, tend to be more evenly distributed. As a result, head features are captured more completely, leading to better performance. Notably, the proposed NDConv is a lightweight module with a computational burden similar to that of deformable convolution. In extensive experiments, our method outperforms state-of-the-art methods on the ShanghaiTech A, ShanghaiTech B, UCF\_QNRF, and UCF\_CC\_50 datasets, achieving MAEs of 61.4, 7.8, 91.2, and 167.2, respectively. The code is available at https://github.com/bingshuangzhuzi/NDConv
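To make the density-map construction mentioned above concrete, the following is a minimal sketch in plain Python and is not taken from the released code. It places one normalized Gaussian at each annotated head so that the map sums to the head count. The function names, the kernel size, the fixed `sigma`, and the renormalization at image borders are illustrative assumptions; the paper may use a different (for example, geometry-adaptive) kernel.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def density_map(height, width, heads, size=15, sigma=4.0):
    """Build a density map from head annotations given as in-bounds (row, col) pairs.

    Assumption: a fixed kernel is used for every head, and the part of the
    kernel that falls inside the image is renormalized so that each head
    still contributes exactly 1 to the total count.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    density = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        # Collect the kernel cells that land inside the image.
        cells = []
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                y, x = r + dy, c + dx
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy + half][dx + half]))
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            density[y][x] += w / mass
    return density

# Hypothetical dot annotations: three heads, one of them at the image corner.
heads = [(10, 12), (30, 40), (0, 0)]
dmap = density_map(48, 64, heads)
count = sum(sum(row) for row in dmap)  # integrates to the number of heads
```

Summing the map recovers the number of annotated heads, which is the quantity compared against the prediction when computing MAE.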