Electrocardiogram (ECG) datasets tend to be highly imbalanced because abnormal cases are scarce. In addition, the use of real patients' ECGs is highly regulated because of privacy concerns. There is therefore a persistent need for more ECG data, especially for training automatic-diagnosis machine learning models, which perform better when trained on a balanced dataset. We studied the synthetic ECG generation capability of five models from the generative adversarial network (GAN) family and compared their performance, focusing only on Normal cardiac cycles. Dynamic Time Warping (DTW), Fréchet, and Euclidean distance functions were used to measure performance quantitatively. Five methods for evaluating generated beats were proposed and applied. We also proposed three new concepts (threshold, accepted beat, and productivity rate) and used them together with these methods as a systematic basis for comparing the models. The results show that all the tested models can, to an extent, mass-generate acceptable heartbeats with high similarity in morphological features, and that all of them could potentially be used to augment imbalanced datasets. However, visual inspection of the generated beats favors the BiLSTM-DC GAN and the WGAN, as they produce statistically more acceptable beats. With regard to productivity rate, the Classic GAN is superior, with a productivity rate of 72%.
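The three distance functions and the productivity-rate idea can be illustrated with a minimal sketch. The abstract does not define threshold, accepted beat, or productivity rate precisely, so the code below rests on assumptions: a generated beat is treated as "accepted" when its distance to a reference template beat is at or below a chosen threshold, and the productivity rate is taken to be the fraction of accepted beats. The function names, the template, and the threshold value are all hypothetical, and the Fréchet distance is computed in its discrete form on amplitudes only, which is a simplification.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two beats of equal length."""
    if len(a) != len(b):
        raise ValueError("beats must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic Time Warping cost with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def discrete_frechet(a, b):
    """Discrete Frechet distance on amplitude sequences (simplified, 1-D)."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(a[i] - b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


def productivity_rate(beats, template, distance, threshold):
    """Assumed definition: share of beats within `threshold` of `template`."""
    if not beats:
        raise ValueError("no beats to evaluate")
    accepted = [b for b in beats if distance(b, template) <= threshold]
    return len(accepted) / len(beats)


# Toy data standing in for generated beats; not real ECG samples.
template = [0.0, 1.0, 0.0]
generated = [[0.0, 1.0, 0.0], [0.0, 1.1, 0.0], [0.0, 3.0, 0.0]]
rate = productivity_rate(generated, template, euclidean, threshold=0.5)
```

Passing `dtw` or `discrete_frechet` as the `distance` argument gives the corresponding rate under the same assumed acceptance rule. A threshold suited to one distance function will generally not suit another, because the three functions produce values on different scales.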