This paper presents a systematic investigation into the effectiveness of Self-Supervised Learning (SSL) methods for Electrocardiogram (ECG) arrhythmia detection. We begin with a novel distribution analysis of three popular ECG-based arrhythmia datasets: PTB-XL, Chapman, and Ribeiro. To the best of our knowledge, our study is the first to quantify these distributions in this area. We then perform a comprehensive set of experiments with different augmentations and parameters to evaluate three SSL methods, namely SimCLR, BYOL, and SwAV, for ECG representation learning, and observe that SwAV achieves the best performance. Furthermore, our analysis shows that SSL methods achieve results that are highly competitive with those of supervised state-of-the-art methods. To assess the performance of these methods on both In-Distribution (ID) and Out-of-Distribution (OOD) ECG data, we conduct cross-dataset training and testing experiments. These experiments yield almost identical results under the ID and OOD schemes, indicating that SSL techniques can learn highly effective representations that generalize well across different OOD datasets. This finding can have major implications for ECG-based arrhythmia detection. Lastly, to further analyze our results, we perform detailed per-disease studies of the performance of the SSL methods on the three datasets.