Spatial-temporal Transformer-guided Diffusion based Data Augmentation for Efficient Skeleton-based Action Recognition

Skeleton-based human action recognition has recently become an active research topic, because the compact representation of human skeletons offers a new angle on action analysis. Researchers have accordingly turned to extracting skeleton information from RGB or other sensors to analyze human actions. With the rapid development of deep learning (DL), many skeleton-based action recognition approaches with carefully designed DL architectures have been proposed. However, a well-trained DL model demands sufficient high-quality data, which is expensive and labor-intensive to obtain. In this paper, we introduce a novel data augmentation method for skeleton-based action recognition that generates high-quality and diverse action sequences. To obtain natural and realistic action sequences, we use denoising diffusion probabilistic models (DDPMs) to generate synthetic action sequences, with the generation process guided by a spatial-temporal transformer (ST-Trans). Experimental results show that our method outperforms state-of-the-art (SOTA) motion generation approaches on a range of naturalness and diversity metrics. They also show that the high-quality synthetic data can be used with existing action recognition models to significantly improve their performance.
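
As a rough illustration of the diffusion side of this idea, the following minimal sketch shows only the standard DDPM forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, applied to a toy skeleton sequence. It is not the authors' implementation: the linear beta schedule, the sequence dimensions, and all function names are assumptions made for illustration, and the ST-Trans network that would be trained to guide the reverse (denoising) process is not shown, since the abstract does not describe its architecture.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used default)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a skeleton sequence.

    x0 is a nested list indexed as [frame][joint][coordinate];
    t is a 0-based step index. Returns the noisy sequence and the noise used.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [[[rng.gauss(0.0, 1.0) for _ in joint] for joint in frame]
             for frame in x0]
    xt = [[[signal * c + sigma * e for c, e in zip(joint, eps_joint)]
           for joint, eps_joint in zip(frame, eps_frame)]
          for frame, eps_frame in zip(x0, noise)]
    return xt, noise


# Hypothetical toy sequence: 4 frames, 3 joints, 3D coordinates.
rng = random.Random(0)
x0 = [[[0.1 * f + 0.01 * j, 0.2 * j, 1.0] for j in range(3)] for f in range(4)]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
xt, eps = add_noise(x0, 500, alpha_bars, rng)
```

In a DDPM-style setup, a denoising network is trained to predict `eps` from `xt` and `t`; synthetic sequences are then obtained by starting from pure noise and applying the learned denoising steps in reverse. In the method described above, that role would fall to the spatial-temporal transformer.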
