Human pose forecasting is a complex structured-data sequence-modelling task, which has received increasing attention, also due to numerous potential applications. Research has mainly addressed the temporal dimension as time series and the interaction of human body joints with a kinematic tree or by a graph. This has decoupled the two aspects and leveraged progress from the relevant fields, but it has also limited the understanding of the complex structural joint spatio-temporal dynamics of the human pose. Here we propose a novel Space-Time-Separable Graph Convolutional Network (STS-GCN) for pose forecasting. For the first time, STS-GCN models the human pose dynamics only with a graph convolutional network (GCN), including the temporal evolution and the spatial joint interaction within a single-graph framework, which allows the cross-talk of motion and spatial correlations. Concurrently, STS-GCN is the first space-time-separable GCN: the space-time graph connectivity is factored into space and time affinity matrices, which bottlenecks the space-time cross-talk, while enabling full joint-joint and time-time correlations. Both affinity matrices are learnt end-to-end, which results in connections substantially deviating from the standard kinematic tree and the linear-time time series. In experimental evaluation on three complex, recent and large-scale benchmarks, Human3.6M [Ionescu et al. TPAMI'14], AMASS [Mahmood et al. ICCV'19] and 3DPW [Von Marcard et al. ECCV'18], STS-GCN outperforms the state-of-the-art, surpassing the current best technique [Mao et al. ECCV'20] by over 32% in average at the most difficult long-term predictions, while only requiring 1.7% of its parameters. We explain the results qualitatively and illustrate the graph interactions by the factored joint-joint and time-time learnt graph connections. Our source code is available at: https://github.com/FraLuca/STSGCN
翻译:人类表面预测是一项复杂的结构化数据序列模型任务,受到越来越多的关注,这也是许多潜在应用的缘故。研究主要涉及时间序列的时间层面,以及人体身体联合与运动树或图形的相互作用。这分解了这两个方面,利用了相关领域的进展,但也限制了对复杂的结构联合空间-时间动态的理解。在这里,我们提议建立一个新的空间-时间分离的变异图形网络(STS-GCN),以作出预测。第一次,STS-GCN模型将人类的变异动态作为时间序列以及人体与运动之间的相互作用,包括时间进化和空间联合互动,从而可以对运动和空间相关性进行交叉讨论。与此同时,STS-GCN是第一个空间-时间分解的GCN:空间-时间图连接仅被空间-时间-时间流变异变(STS-GCN) 和时间变异变变异变模型,它阻碍着空间-时间的变异连接关系,同时,在数字-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-时间-