Objective: The use of deep learning for electroencephalography (EEG) classification tasks has grown rapidly in recent years, yet its application has been limited by the relatively small size of EEG datasets. Data augmentation, which consists of artificially increasing the size of the dataset during training, can be employed to alleviate this problem. While a few augmentation transformations for EEG data have been proposed in the literature, their positive impact on performance is often evaluated on a single dataset and compared to only one or two competing augmentation methods. This work proposes to better validate the existing data augmentation approaches through a unified and exhaustive analysis. Approach: We quantitatively compare 13 different augmentations on two different predictive tasks, datasets and models, using three different types of experiments. Main results: We demonstrate that employing adequate data augmentations can bring accuracy improvements of up to 45% in low data regimes compared to the same model trained without any augmentation. Our experiments also show that there is no single best augmentation strategy, as the augmentations that work well differ from task to task. Significance: Our results highlight the best data augmentations to consider for sleep stage classification and motor imagery brain-computer interfaces. More broadly, they demonstrate that EEG classification tasks benefit from adequate data augmentation.
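To make the idea of augmenting EEG data during training concrete, the following is a minimal sketch of three generic signal transformations applied at random to one EEG window. The transformations (sign flip, time reversal, additive Gaussian noise), the function names, the probability `p`, the noise level `std` and the list-of-channels data layout are all assumptions chosen for illustration; the abstract does not list the 13 augmentations that were compared, so these are not necessarily among them.

```python
import random

# Hypothetical layout: an EEG window is a list of channels,
# each channel being a list of float samples.

def sign_flip(window):
    """Invert the polarity of every channel."""
    return [[-x for x in channel] for channel in window]

def time_reverse(window):
    """Reverse each channel along the time axis."""
    return [channel[::-1] for channel in window]

def add_gaussian_noise(window, rng, std=0.1):
    """Add zero-mean Gaussian noise to every sample."""
    return [[x + rng.gauss(0.0, std) for x in channel] for channel in window]

def augment(window, rng, p=0.5):
    """Apply each transformation independently with probability p.

    The label is left unchanged, which assumes that the transformations
    preserve the class; this has to be checked for each task.
    """
    if rng.random() < p:
        window = sign_flip(window)
    if rng.random() < p:
        window = time_reverse(window)
    if rng.random() < p:
        window = add_gaussian_noise(window, rng)
    return window

rng = random.Random(0)  # fixed seed so the sketch is reproducible
window = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]  # 2 channels, 3 samples
augmented = augment(window, rng)
```

In a training loop, `augment` would typically be called on each window every time it is drawn, so that the model sees a different variant at each epoch while the stored dataset stays the same size.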