Single-photon sensitive depth sensors are increasingly used in next-generation electronics for human pose and gesture recognition. However, cost-effective sensors typically have a low spatial resolution, restricting their use to basic motion identification and simple object detection. Here we perform a temporal-to-spatial mapping that drastically increases the resolution of a simple time-of-flight sensor, i.e.~from an initial resolution of 4$\times$4 pixels to depth images of 32$\times$32 pixels. The output depth maps can then be used for accurate three-dimensional human pose estimation of multiple people. We develop a new explainable framework that provides intuition into how our network utilizes its input data and yields key information about the relevant parameters. Our work greatly expands the use cases of simple SPAD time-of-flight sensors and opens up promising possibilities for future super-resolution techniques applied to other sensors with similar data types, e.g.~radar and sonar.