We present a multi-camera 3D pedestrian detection method that requires no training on data from the target scene. We estimate each pedestrian's location on the ground plane using a novel heuristic based on human body poses and person bounding boxes from an off-the-shelf monocular detector. We then project these locations onto the world ground plane and fuse them using a new formulation of the clique cover problem. We also propose an optional step that exploits pedestrian appearance during fusion by using a domain-generalizable person re-identification model. We evaluated the proposed approach on the challenging WILDTRACK dataset, where it obtained a MODA of 0.569 and an F-score of 0.78, outperforming state-of-the-art generalizable detection techniques.
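The projection step described in the abstract can be illustrated with a minimal sketch. It assumes that each camera comes with a known 3x3 homography from image pixels to the world ground plane, and it uses the bottom-centre of a bounding box as a simple stand-in for the pose-based heuristic, whose details are not given here. The function names, the matrix `H`, and the example boxes are hypothetical and chosen only for illustration.

```python
import numpy as np

def bottom_center(box):
    """Bottom-centre pixel of an (x1, y1, x2, y2) box.

    Assumption: a simple stand-in for the pose-based heuristic,
    not the method proposed in the abstract.
    """
    x1, y1, x2, y2 = box
    return np.array([(x1 + x2) / 2.0, y2], dtype=float)

def to_ground_plane(points_px, H):
    """Map N x 2 pixel points to ground-plane coordinates via a 3x3 homography H."""
    pts = np.asarray(points_px, dtype=float).reshape(-1, 2)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # append w = 1
    mapped = homog @ H.T                              # apply H to every point
    return mapped[:, :2] / mapped[:, 2:3]             # divide by w to dehomogenise

# Hypothetical homography for one camera (pixels -> metres on the ground plane).
H = np.array([[0.01, 0.0, -3.0],
              [0.0, 0.02, -5.0],
              [0.0, 0.0, 1.0]])

# Hypothetical detections from a monocular detector in that camera.
boxes = [(100, 50, 140, 250), (300, 80, 360, 400)]
feet = np.array([bottom_center(b) for b in boxes])
ground = to_ground_plane(feet, H)
```

Repeating this for every camera yields a set of ground-plane points in a shared world frame, which is the input that a fusion step such as the clique cover formulation would then group into individual pedestrians.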