This paper addresses the problem of head detection in crowded environments. Our detection is based entirely on geometric consistency across cameras with overlapping fields of view, and no additional learning process is required. We propose a fully unsupervised method for inferring scene and camera geometry, in contrast to existing algorithms, which require specific calibration procedures. Moreover, we avoid relying on background subtraction or on the presence of body parts other than heads, both of which have limited effectiveness under heavy clutter. We cast the head detection problem as a stereo MRF-based optimization of a dense pedestrian height map, and we introduce a constraint that aligns the height gradient with the direction of the vertical vanishing point. We validate the method in an outdoor setting with varying pedestrian density levels. With only three views, our approach can simultaneously detect tens of heavily occluded pedestrians across a large, homogeneous area.
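The abstract does not state the exact form of the vanishing point constraint. As a minimal sketch of one plausible reading, the Python snippet below penalizes the component of the height-map gradient that is perpendicular to the image direction pointing toward the vertical vanishing point. The function names, the quadratic penalty, and the use of finite differences are all assumptions made for illustration and are not taken from the paper; the stereo data term of the MRF is omitted entirely.

```python
import numpy as np

def vanishing_directions(shape, vp):
    """Unit vectors from every pixel toward the vanishing point vp = (x, y).

    Hypothetical helper: the per-pixel direction field is an assumption,
    not the paper's stated formulation.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    dx = vp[0] - xs
    dy = vp[1] - ys
    norm = np.hypot(dx, dy)
    norm[norm == 0] = 1.0  # guard against a pixel coinciding with vp
    return dx / norm, dy / norm

def misalignment_penalty(height_map, vp):
    """Sum of squared gradient components perpendicular to the vp direction.

    Assumed quadratic form; it is zero when the height gradient is
    everywhere parallel to the direction of the vertical vanishing point.
    """
    gy, gx = np.gradient(height_map)  # derivatives along rows (y), then columns (x)
    dx, dy = vanishing_directions(height_map.shape, vp)
    perp = gx * dy - gy * dx  # 2D cross product of gradient and direction
    return float(np.sum(perp ** 2))

# Toy example: a vanishing point far below the image makes "vertical" the y axis.
ys, xs = np.mgrid[0:16, 0:16].astype(float)
far_vp = (8.0, 1e6)
aligned = misalignment_penalty(ys, far_vp)     # height varies along y
misaligned = misalignment_penalty(xs, far_vp)  # height varies along x
```

In this toy setting the height map that varies along the vertical direction incurs a near-zero penalty, while the one that varies across it is penalized at almost every pixel. In a full MRF such a term would be added to the stereo matching cost as a pairwise or regularization energy.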