Video-based gait recognition has achieved impressive results in constrained scenarios. However, cameras do not capture the 3D structure of the human body, which limits the feasibility of gait recognition in unconstrained, real-world 3D environments. Instead of extracting gait features from images, this work explores precise 3D gait features from point clouds and proposes a simple yet efficient 3D gait recognition framework, termed LidarGait. The proposed approach projects sparse point clouds into depth maps to learn representations that carry 3D geometry information, and it outperforms existing point-wise and camera-based methods by a significant margin. Because point cloud datasets for gait recognition were lacking, we built the first large-scale LiDAR-based gait recognition dataset, SUSTech1K, collected with a LiDAR sensor and an RGB camera. The dataset contains 25,239 sequences from 1,050 subjects and covers many variations, including visibility, views, occlusions, clothing, carrying, and scenes. Extensive experiments show that (1) 3D structure information serves as a significant feature for gait recognition; (2) LidarGait outperforms existing point-based and silhouette-based methods by a significant margin, while also offering stable cross-view results; and (3) the LiDAR sensor is superior to the RGB camera for gait recognition in outdoor environments. The source code and dataset are available at https://lidargait.github.io.
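The abstract states that sparse point clouds are projected into depth maps but does not specify the projection. As a rough illustration only, the sketch below assumes a spherical (range-view) projection cropped to the angular extent of the points, with the nearest point kept per pixel. The function name `project_to_depth_map` and the `height` and `width` parameters are hypothetical and are not taken from the LidarGait code.

```python
import numpy as np


def project_to_depth_map(points, height=64, width=64):
    """Project an (N, 3) point cloud onto a (height, width) depth map.

    Assumption: spherical projection with the sensor at the origin.
    Each pixel stores the range of the nearest point; empty pixels are 0.
    """
    points = np.asarray(points, dtype=np.float64)
    if points.size == 0:
        return np.zeros((height, width), dtype=np.float64)

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)

    # Drop points at the origin, whose direction is undefined.
    valid = r > 0
    x, y, z, r = x[valid], y[valid], z[valid], r[valid]
    if r.size == 0:
        return np.zeros((height, width), dtype=np.float64)

    azimuth = np.arctan2(y, x)      # horizontal angle
    elevation = np.arcsin(z / r)    # vertical angle

    def to_index(values, size):
        # Linearly map values onto the integer range [0, size - 1].
        span = values.max() - values.min()
        if span == 0:
            return np.zeros(values.shape, dtype=np.int64)
        scaled = (values - values.min()) / span * (size - 1)
        return np.round(scaled).astype(np.int64)

    col = to_index(-azimuth, width)     # larger azimuth -> smaller column
    row = to_index(-elevation, height)  # higher points -> smaller row

    # Keep the smallest range per pixel when several points collide.
    depth_map = np.full((height, width), np.inf)
    np.minimum.at(depth_map, (row, col), r)
    depth_map[np.isinf(depth_map)] = 0.0
    return depth_map


# Usage: three points, two of which lie along the same ray.
cloud = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, -1.0, -1.0]])
depth = project_to_depth_map(cloud, height=8, width=4)
print(depth.shape, np.count_nonzero(depth))
```

The resulting 2D array can be fed to an image-style network, which is the general idea the abstract describes; the actual resolution, normalization, and projection used by LidarGait may differ.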