We develop a Synthetic Fusion Pyramid Network (SPF-Net) with a scale-aware loss function for accurate crowd counting. Existing crowd-counting methods assume that the training annotation points are accurate, and thus ignore the fact that noisy annotations can lead to large model-learning bias and counting error, especially for highly dense crowds that appear far away. To the best of our knowledge, this work is the first to properly handle such noise at multiple scales in an end-to-end loss design, and thereby advance the state of the art in crowd counting. We model the noise of crowd annotation points as a Gaussian and derive the crowd probability density map from the input image. We then approximate the joint distribution of crowd density maps with a full covariance across multiple scales, and derive a low-rank approximation for tractability and efficient implementation. The derived scale-aware loss function is used to train the SPF-Net. We show that it outperforms various loss functions on four public datasets: UCF-QNRF, UCF_CC_50, NWPU, and ShanghaiTech (Parts A and B). The proposed SPF-Net can accurately predict the locations of people in the crowd despite being trained on noisy annotations.
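
To illustrate why a low-rank approximation makes a full-covariance Gaussian tractable, the following is a minimal NumPy sketch, not the loss derived in this work. It assumes a generic covariance of the form diag(d) + U Uᵀ and evaluates the Gaussian negative log-likelihood with the Woodbury identity and the matrix determinant lemma, so the n × n covariance is never formed. The function name `lowrank_gaussian_nll` and its arguments are hypothetical and chosen for illustration only.

```python
import numpy as np

def lowrank_gaussian_nll(residual, diag, factor):
    """Negative log-likelihood of `residual` under N(0, diag(d) + U U^T).

    residual: (n,) difference between prediction and target (assumed flattened)
    diag:     (n,) positive diagonal variances d
    factor:   (n, r) low-rank factor U, with r much smaller than n
    """
    n, r = factor.shape
    d_inv = 1.0 / diag

    # D^{-1} U, computed row-wise without building the diagonal matrix.
    scaled = factor * d_inv[:, None]

    # Small r x r "capacitance" matrix C = I + U^T D^{-1} U.
    cap = np.eye(r) + factor.T @ scaled

    # Woodbury identity:
    # x^T Sigma^{-1} x = x^T D^{-1} x - (U^T D^{-1} x)^T C^{-1} (U^T D^{-1} x)
    proj = scaled.T @ residual
    quad = residual @ (d_inv * residual) - proj @ np.linalg.solve(cap, proj)

    # Matrix determinant lemma: log|Sigma| = log|C| + sum(log d).
    _, logdet_cap = np.linalg.slogdet(cap)
    logdet = logdet_cap + np.sum(np.log(diag))

    return 0.5 * (quad + logdet + n * np.log(2.0 * np.pi))
```

In this sketch the cost is dominated by the r × r solve and the n × r products, rather than the O(n³) factorization that a dense covariance would need. How the diagonal and low-rank terms are built from multi-scale density maps is specific to the method and is not reproduced here.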