Data augmentation is widely used in classification to enlarge the training set. It improves generalization and reduces the amount of annotated human activity data needed for training, which in turn reduces the labour and time spent preparing the dataset. Unlike images, sensor time-series data cannot be augmented with computationally simple transformations. State-of-the-art models such as Recurrent Generative Adversarial Networks (RGAN) are used to generate realistic synthetic data. In this paper, transformer-based generative adversarial networks, which apply global attention over the data, are compared with RGAN on the PAMAP2 and Real World Human Activity Recognition data sets. The newer approach reduces both the time and the computational resources needed for data augmentation relative to the previous approach.