Interest in applying deep neural networks to the automatic interpretation and analysis of the 12-lead electrocardiogram (ECG) is growing. Current machine learning approaches are often limited by the amount of labeled data available. This limitation is particularly problematic for clinical data, where labeling at scale is time-consuming and costly because it requires specialized expertise and substantial human effort. Moreover, deep learning classifiers may be vulnerable to adversarial examples and perturbations, which could have catastrophic consequences when such models are applied to, for example, medical treatment, clinical trials, or insurance claims. In this paper, we propose a physiologically inspired data augmentation method that improves the performance and robustness of heart disease detection from ECG signals. We obtain augmented samples by perturbing the data distribution towards other classes along the geodesic in Wasserstein space. To better exploit domain-specific knowledge, we design a ground metric that measures the difference between ECG signals using physiologically determined features. Trained on 12-lead ECG signals, our model distinguishes five categories of cardiac conditions. Our results show improvements in both accuracy and robustness, reflecting the effectiveness of the proposed data augmentation method.
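The idea of moving samples along a Wasserstein geodesic can be illustrated with a minimal sketch. It is not the method of this paper: the abstract does not specify the ground metric or the solver, so the weighted squared Euclidean cost, the `weights` vector, and all function names below are hypothetical. The sketch assumes two equally sized sets of feature vectors with uniform mass, finds the optimal matching by brute force (practical only for a handful of samples), and then interpolates each matched pair. A small `t` keeps the augmented samples close to their original class.

```python
from itertools import permutations


def ground_cost(x, y, weights):
    """Hypothetical ground cost: weighted squared Euclidean distance
    between two feature vectors (weights stand in for domain knowledge)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def optimal_matching(source, target, weights):
    """Brute-force optimal assignment between two equally sized sample sets.

    Returns the best permutation and the mean transport cost, which equals
    the squared 2-Wasserstein distance between the two uniform empirical
    distributions under the ground cost above.
    """
    n = len(source)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(ground_cost(source[i], target[perm[i]], weights)
                    for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return best_perm, best_total / n


def geodesic_samples(source, target, weights, t):
    """Points at position t in [0, 1] on the displacement interpolation
    from the source samples (t = 0) to their matched targets (t = 1)."""
    perm, _ = optimal_matching(source, target, weights)
    return [
        [(1 - t) * a + t * b for a, b in zip(source[i], target[perm[i]])]
        for i in range(len(source))
    ]


# Toy feature vectors for two classes (illustrative values only).
class_a = [[0.0, 0.0], [1.0, 0.0]]
class_b = [[1.0, 1.0], [0.0, 1.0]]
feature_weights = [1.0, 1.0]

# Perturb class A a quarter of the way towards class B.
augmented = geodesic_samples(class_a, class_b, feature_weights, t=0.25)
```

For realistic sample sizes the brute-force search would have to be replaced by a proper assignment or optimal transport solver; the interpolation step itself stays the same.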