Temporal action segmentation (TAS) aims to classify and localize actions in long, untrimmed action sequences. With the success of deep learning, many deep models for action segmentation have emerged. However, few-shot TAS remains a challenging problem. This study proposes an efficient framework for few-shot skeleton-based TAS, comprising a data augmentation method and an improved model. The data augmentation method, based on motion interpolation, addresses the problem of insufficient data and significantly increases the number of samples by synthesizing action sequences. In addition, we combine a Connectionist Temporal Classification (CTC) layer with a network designed for skeleton-based TAS to obtain an optimized model. Leveraging CTC enhances the temporal alignment between predictions and ground truth and further improves the segment-wise metrics of the segmentation results. Extensive experiments on public and self-constructed datasets, including two small-scale datasets and one large-scale dataset, show the effectiveness of the two proposed methods in improving performance on the few-shot skeleton-based TAS task.
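As an illustration of how motion interpolation can synthesize longer training sequences from a handful of short action clips, the following is a minimal sketch and not the implementation used in this study. It assumes that each pose is a list of joint coordinates, that transitions between consecutive clips are filled by linear interpolation, and that transition frames receive a separate label. The names `lerp_pose`, `synthesize_sequence`, `n_transition`, and `transition_label` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pose = List[List[float]]        # one frame: joints x coordinates
Clip = Tuple[List[Pose], int]   # (frames of one action, action label)


def lerp_pose(a: Pose, b: Pose, t: float) -> Pose:
    """Blend two poses joint by joint; t=0 returns a, t=1 returns b."""
    return [
        [(1.0 - t) * xa + t * xb for xa, xb in zip(joint_a, joint_b)]
        for joint_a, joint_b in zip(a, b)
    ]


def synthesize_sequence(
    clips: Sequence[Clip],
    n_transition: int = 4,
    transition_label: int = -1,  # assumption: transitions get their own label
) -> Tuple[List[Pose], List[int]]:
    """Concatenate clips, inserting interpolated frames between neighbours."""
    frames: List[Pose] = []
    labels: List[int] = []
    for i, (clip_frames, label) in enumerate(clips):
        if i > 0:
            # Bridge the last pose so far and the first pose of the next clip.
            prev_pose, next_pose = frames[-1], clip_frames[0]
            for k in range(1, n_transition + 1):
                t = k / (n_transition + 1)
                frames.append(lerp_pose(prev_pose, next_pose, t))
                labels.append(transition_label)
        frames.extend(clip_frames)
        labels.extend([label] * len(clip_frames))
    return frames, labels


# Toy example: one joint with two coordinates per frame.
clip_a: Clip = ([[[0.0, 0.0]], [[1.0, 1.0]]], 0)
clip_b: Clip = ([[[3.0, 5.0]]], 1)
seq_frames, seq_labels = synthesize_sequence([clip_a, clip_b], n_transition=1)
```

Sampling different clip orders and transition lengths would yield many distinct synthetic sequences from the same few clips, each with frame-wise labels that can be used to train a segmentation model.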