Models of human motion commonly focus either on trajectory prediction or on action classification, but rarely both. The marked heterogeneity and intricate compositionality of human motion render each task vulnerable to the data degradation and distributional shift common in real-world scenarios. A sufficiently expressive generative model of action could, in theory, enable data conditioning and distributional resilience within a unified framework applicable to both tasks. Here we propose a novel architecture based on hierarchical variational autoencoders and deep graph convolutional neural networks for generating a holistic model of action over multiple time-scales. We show this Hierarchical Graph-convolutional Variational Autoencoder (HG-VAE) to be capable of generating coherent actions, detecting out-of-distribution data, and imputing missing data by gradient ascent on the model's posterior. Trained and evaluated on H3.6M and on AMASS, the largest collection of open-source human motion data, HG-VAE facilitates downstream discriminative learning better than baseline models.
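The imputation mechanism mentioned above, recovering missing data by gradient ascent on the model's posterior, can be illustrated with a minimal toy sketch. Here a fixed linear map `W` stands in for the decoder (an assumption purely for illustration; HG-VAE uses a hierarchical graph-convolutional decoder), and a latent `z` is optimised to maximise the log-posterior given only the observed entries of `x`:

```python
import numpy as np

# Toy sketch: imputation by gradient ascent on a latent posterior.
# W is a stand-in linear "decoder" (illustrative assumption only);
# HG-VAE's real decoder is hierarchical and graph-convolutional.
rng = np.random.default_rng(0)
d_z, d_x = 2, 6
W = rng.normal(size=(d_x, d_z))            # stand-in decoder weights
z_true = rng.normal(size=d_z)
x_full = W @ z_true                        # clean observation
mask = np.array([1, 1, 0, 1, 0, 1], bool)  # which entries are observed

z = np.zeros(d_z)                          # latent initialised at the prior mean
lr = 0.05
for _ in range(1000):
    # Unnormalised log-posterior: Gaussian likelihood on observed
    # dimensions plus an isotropic N(0, I) prior on z.
    resid = (W @ z - x_full) * mask
    grad = -W.T @ resid - z                # ascent direction on log p(z | x_obs)
    z += lr * grad

x_imputed = W @ z                          # decoder output fills the missing entries
print(x_imputed)
```

With a well-specified decoder, the optimised latent converges to the MAP estimate under the observed dimensions, and decoding it yields values for the masked entries; the same principle applies when the decoder is a deep network and the gradient is obtained by backpropagation.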