Perceiving humans in the context of Intelligent Transportation Systems (ITS) often relies on multiple cameras or expensive LiDAR sensors. In this work, we present a new cost-effective vision-based method that perceives humans' 3D locations and body orientations from a single image. We address the challenges of the ill-posed monocular 3D task by proposing a neural network architecture that predicts confidence intervals rather than point estimates. Our neural network estimates 3D human body locations and orientations together with a measure of uncertainty. Our proposed solution (i) is privacy-safe, (ii) works with any fixed or moving camera, and (iii) does not rely on ground plane estimation. We demonstrate the performance of our method on three applications: locating humans in 3D, detecting social interactions, and verifying compliance with recent safety measures introduced in response to the COVID-19 outbreak. We show that it is possible to rethink the concept of "social distancing" as a form of social interaction rather than a simple location-based rule. We publicly share the source code in support of open science.
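To illustrate what predicting a confidence interval instead of a point estimate can look like, the following is a minimal sketch, not the released implementation. It assumes the network outputs, for each person, a distance estimate `mu` and a positive spread `b`, and that the spread is interpreted as the scale of a Laplace distribution; the choice of distribution and the function names `laplace_nll` and `laplace_interval` are assumptions made only for this example.

```python
import math


def laplace_nll(target, mu, b):
    """Negative log-likelihood of `target` under Laplace(mu, b).

    Minimising this jointly over (mu, b) is one way to let a network
    learn a spread alongside its location estimate (assumed here).
    """
    if b <= 0:
        raise ValueError("scale b must be positive")
    return abs(target - mu) / b + math.log(2.0 * b)


def laplace_interval(mu, b, coverage=0.95):
    """Central interval holding `coverage` of the Laplace(mu, b) mass.

    For a Laplace distribution, P(|x - mu| <= w) = 1 - exp(-w / b),
    so the half-width is w = -b * ln(1 - coverage).
    """
    if b <= 0:
        raise ValueError("scale b must be positive")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    half_width = -b * math.log(1.0 - coverage)
    return mu - half_width, mu + half_width


if __name__ == "__main__":
    # Hypothetical prediction: a person about 10 m away, spread of 0.5 m.
    low, high = laplace_interval(mu=10.0, b=0.5, coverage=0.95)
    print(f"95% interval for distance: [{low:.2f} m, {high:.2f} m]")
```

Under these assumptions, a larger predicted `b` widens the interval, so a downstream check such as a distancing rule can treat distant or ambiguous detections more cautiously than a single point estimate would allow.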