We present a method for training neural networks with synthetic electrocardiograms that mimic signals produced by a wearable single-lead electrocardiogram monitor. We use domain randomization, in which synthetic signal properties such as the waveform shape, RR-intervals, and noise are varied for every training example. Models trained with synthetic data are compared to their counterparts trained with real data, using detection of R-waves in electrocardiograms recorded during different physical activities and in atrial fibrillation as the benchmark task. When the randomization is allowed to extend beyond what is typically observed in real-world data, performance is on par with or exceeds that of networks trained with real data. Experiments show robust performance across different seeds and training examples on different test sets, without any test-set-specific tuning. The method makes it possible to train neural networks using practically free-to-collect data with accurate labels and no manual annotation, and it opens up the possibility of extending synthetic data to cardiac disease classification when disease-specific a priori information is used in the electrocardiogram generation. Additionally, the distribution of the data can be controlled, eliminating the class imbalances typically observed in health-related data, and the generated data is inherently private.
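As a rough illustration of domain randomization for this task, the sketch below generates one synthetic single-lead example in which the waveform shape, RR-intervals, and noise are re-drawn on every call, and the R-wave labels are known by construction. It is not the authors' generator: the function name `synthetic_ecg`, the Gaussian-bump beat model, and every numeric range are assumptions chosen only for illustration.

```python
import math
import random


def synthetic_ecg(rng, duration_s=10.0, fs=250):
    """Return (signal, r_peaks) for one randomized synthetic ECG example.

    All parameter ranges below are illustrative assumptions, not values
    taken from the paper.
    """
    n = int(duration_s * fs)
    signal = [0.0] * n

    # Waveform shape: each wave is a Gaussian bump described by
    # (offset from the R-peak in s, amplitude, width in s), re-drawn per example.
    waves = [
        (rng.uniform(-0.25, -0.15), rng.uniform(0.0, 0.3), rng.uniform(0.02, 0.04)),     # P
        (rng.uniform(-0.05, -0.03), rng.uniform(-0.3, 0.0), rng.uniform(0.008, 0.015)),  # Q
        (0.0, rng.uniform(0.5, 2.0), rng.uniform(0.008, 0.02)),                          # R
        (rng.uniform(0.03, 0.05), rng.uniform(-0.5, 0.0), rng.uniform(0.008, 0.015)),    # S
        (rng.uniform(0.2, 0.35), rng.uniform(0.0, 0.6), rng.uniform(0.04, 0.08)),        # T
    ]

    # RR-intervals: random mean interval plus per-beat jitter; large jitter
    # gives irregular rhythms loosely resembling atrial fibrillation.
    mean_rr = rng.uniform(0.4, 1.5)
    jitter = rng.uniform(0.0, 0.3)
    r_peaks = []
    t = rng.uniform(0.0, mean_rr)
    while t < duration_s:
        idx = int(round(t * fs))
        if idx < n:
            r_peaks.append(idx)  # label: sample index of the R-wave centre
        t += max(0.25, mean_rr * (1.0 + rng.uniform(-jitter, jitter)))

    # Add every wave of every beat to the signal.
    for peak in r_peaks:
        for offset, amp, width in waves:
            centre = peak / fs + offset
            lo = max(0, int((centre - 4 * width) * fs))
            hi = min(n, int((centre + 4 * width) * fs) + 1)
            for i in range(lo, hi):
                signal[i] += amp * math.exp(-0.5 * ((i / fs - centre) / width) ** 2)

    # Noise: white noise plus slow baseline wander, both with random strength.
    noise_sd = rng.uniform(0.0, 0.2)
    wander_amp = rng.uniform(0.0, 0.5)
    wander_hz = rng.uniform(0.05, 0.5)
    phase = rng.uniform(0.0, 2 * math.pi)
    for i in range(n):
        signal[i] += rng.gauss(0.0, noise_sd)
        signal[i] += wander_amp * math.sin(2 * math.pi * wander_hz * i / fs + phase)

    return signal, r_peaks
```

Calling `synthetic_ecg(random.Random(seed))` with a new seed for each training example yields a fresh signal together with its R-wave sample indices, so no manual annotation is needed. Widening the ranges beyond physiologically typical values is one way to mimic letting the randomization exceed what real-world data shows.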