Crowd counting is a challenging problem because of scene complexity and scale variation. Although deep learning has brought substantial improvements to crowd counting, complex scenes still mislead these methods, which often mistake background objects for people and can therefore produce very large counting errors. To address this problem, we propose a novel end-to-end model called the Crowd Attention Convolutional Neural Network (CAT-CNN). CAT-CNN adaptively assesses the importance of a human head at each pixel location by automatically encoding a confidence map. Guided by this confidence map, the head positions in the estimated density map receive more attention when the final density map is encoded, which effectively avoids large misjudgements. The crowd count is then obtained by integrating the final density map. To encode a highly refined density map, the total crowd count of each image is classified in a dedicated classification task, and we are the first to explicitly map the prior of the population-level category to feature maps. To verify the effectiveness of the proposed method, we conduct extensive experiments on three highly challenging datasets. The results show that our method outperforms many state-of-the-art methods.
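The two operations the abstract describes, weighting the estimated density map by a confidence map and integrating the result to obtain a count, can be illustrated with a small sketch. This is not the CAT-CNN implementation: the abstract does not state how the confidence map is combined with the density map, so the element-wise product used here is an assumption, and the function names `refine_density` and `crowd_count` are hypothetical. Plain nested lists stand in for the feature maps a real network would produce.

```python
def refine_density(density, confidence):
    """Weight an estimated density map by a per-pixel confidence map.

    Assumption: the two maps are combined by an element-wise product,
    with confidence values clamped to [0, 1]. The abstract does not
    specify the actual combination used by CAT-CNN.
    """
    same_rows = len(density) == len(confidence)
    if not same_rows or any(len(d) != len(c) for d, c in zip(density, confidence)):
        raise ValueError("density and confidence maps must have the same shape")
    return [
        [d * min(max(c, 0.0), 1.0) for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(density, confidence)
    ]


def crowd_count(density):
    """Integrate a density map: on a pixel grid this is the sum of all values."""
    return sum(sum(row) for row in density)


# Toy 3x3 maps: one true head response and one spurious response
# caused by a background object.
density = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],   # true head
    [0.0, 0.0, 0.5],   # false response
]
confidence = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],   # high confidence at the head
    [0.0, 0.0, 0.0],   # no confidence at the false response
]

final_density = refine_density(density, confidence)
raw_count = crowd_count(density)
final_count = crowd_count(final_density)
print(raw_count, final_count)
```

In this toy case the spurious response inflates the raw count, while the confidence-weighted map keeps only the mass at the head location, which is the kind of misjudgement the confidence map is meant to suppress.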