Ground pressure exerted by the human body is a valuable source of information for human activity recognition (HAR) in unobtrusive pervasive sensing. Since collecting data from pressure sensors to develop HAR solutions requires significant resources and effort, we present PresSim, a novel end-to-end framework that synthesizes sensor data from videos of human activities to reduce that effort substantially. PresSim follows a three-stage process: first, it extracts 3D activity information from video with computer vision architectures; next, it simulates floor-mesh deformation profiles from that 3D activity information with a physics simulation that accounts for gravity; finally, it generates simulated pressure sensor data with deep learning models. We explored two approaches to obtaining the 3D activity information: inverse kinematics with mesh re-targeting, and volumetric pose and shape estimation. We validated PresSim with an experimental setup in which a monocular camera provided the input and a pressure-sensing fitness mat (80x28 spatial resolution) provided the sensor ground truth, with nine participants performing a set of predefined yoga sequences.
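The three-stage pipeline described above can be sketched as a simple chain of functions. This is a minimal illustrative skeleton, not the paper's implementation: all function names are hypothetical, and each stage is a placeholder for the actual vision model, physics simulation, and learned pressure generator.

```python
# Hypothetical sketch of the PresSim three-stage pipeline.
# All names and internals are illustrative placeholders, not the
# authors' released code.
import numpy as np

def extract_3d_activity(video_frames):
    # Stage 1 placeholder: a pose/shape estimator would return
    # per-frame 3D joint positions from monocular video.
    n_joints = 24  # assumed skeleton size, e.g. an SMPL-style body model
    return np.zeros((len(video_frames), n_joints, 3))

def simulate_floor_deformation(poses, grid=(80, 28)):
    # Stage 2 placeholder: a gravity-aware physics simulation would
    # turn body poses into floor-mesh deformation maps at the
    # mat's 80x28 spatial resolution.
    return np.zeros((poses.shape[0], *grid))

def deformation_to_pressure(deformation):
    # Stage 3 placeholder: a deep learning model would map
    # deformation profiles to simulated pressure sensor frames.
    return np.clip(deformation, 0.0, None)

def pressim(video_frames):
    # End-to-end: video frames -> 3D activity -> deformation -> pressure.
    poses = extract_3d_activity(video_frames)
    deformation = simulate_floor_deformation(poses)
    return deformation_to_pressure(deformation)

frames = [None] * 10  # ten dummy video frames
pressure = pressim(frames)
print(pressure.shape)  # (10, 80, 28): frames x mat rows x mat columns
```

The only structural point the sketch captures is the data flow: each stage consumes the previous stage's output, and the final tensor matches the pressure mat's 80x28 sensel grid per frame.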