Conventional methods for human pose estimation either require a high degree of instrumentation, relying on many inertial measurement units (IMUs), or constrain the recording space, relying on extrinsic cameras. We tackle these shortcomings by estimating human pose from sparse IMU data. We define attention-oriented adjacency-adaptive graph convolutional long short-term memory networks (A3GC-LSTM) that perform human pose estimation from six IMUs by incorporating the human body graph structure directly into the network. The A3GC-LSTM combines spatial and temporal dependency in a single network operation, more memory-efficiently than previous approaches. Recurrent graph learning on arbitrarily long sequences is made possible by equipping the graph convolutions with adjacency adaptivity, which eliminates the problem of information loss in deep or recurrent graph networks while also allowing the network to learn unknown dependencies between the human body joints. To further boost accuracy, a spatial attention formalism is incorporated into the recurrent LSTM cell. With the presented approach we exploit the inherent graph nature of the human body and thereby outperform the state of the art for human pose estimation from sparse IMU data.
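To make the core idea concrete, the following is a minimal NumPy sketch of one step of a graph-convolutional LSTM cell with a learnable (adjacency-adaptive) graph: every gate aggregates per-joint features over a softmax-normalized adjacency matrix instead of a plain dense layer. All names, sizes, and initializations here are illustrative assumptions, not the authors' actual A3GC-LSTM implementation, and the spatial attention mechanism is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

J, F, H = 6, 9, 16  # 6 IMU nodes, per-node input features, hidden size (assumed sizes)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def row_softmax(a):
    # Normalize each row of the adjacency logits into a distribution over neighbors.
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Learnable adjacency logits; in practice these would be initialized from the
# skeleton graph and updated by backpropagation (adjacency adaptivity).
A_logits = rng.standard_normal((J, J))

# Each LSTM gate (input i, forget f, cell c, output o) owns its own weights,
# applied after the shared spatial aggregation step.
W = {g: rng.standard_normal((F + H, H)) * 0.1 for g in "ifco"}
b = {g: np.zeros(H) for g in "ifco"}

def a3gc_lstm_step(x, h, c):
    """One time step: spatial graph aggregation fused with the LSTM recurrence."""
    A = row_softmax(A_logits)               # current (adaptive) adjacency
    z = A @ np.concatenate([x, h], axis=1)  # aggregate input+state over joints
    i = sigmoid(z @ W["i"] + b["i"])
    f = sigmoid(z @ W["f"] + b["f"])
    o = sigmoid(z @ W["o"] + b["o"])
    g = np.tanh(z @ W["c"] + b["c"])
    c = f * c + i * g                       # temporal cell update
    h = o * np.tanh(c)
    return h, c

h = np.zeros((J, H))
c = np.zeros((J, H))
for t in range(5):                          # works for arbitrary sequence lengths
    x = rng.standard_normal((J, F))         # per-IMU features at time t
    h, c = a3gc_lstm_step(x, h, c)
print(h.shape)  # (6, 16): one hidden vector per body-graph node
```

Because the spatial aggregation and the recurrent update happen inside a single cell step, spatial and temporal dependencies are handled in one network operation, which is the memory advantage the abstract refers to.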