We introduce ECG-Image-Kit, an open-source toolbox for generating synthetic ECG images with realistic artifacts from time-series data, and showcase its application in developing algorithms for data augmentation and ECG image digitization. Synthetic data is generated by first rendering distortionless ECG images on a standard ECG paper background. Various distortions, including handwritten text artifacts, wrinkles, creases, and perspective transformations, are then applied to these images. The artifacts and text are synthetically generated and contain no personally identifiable information. The toolbox is used for data augmentation in the 2024 PhysioNet Challenge on Digitization and Classification of ECG Images. As a case study, we used ECG-Image-Kit to create an ECG image dataset of 21,801 records from the PhysioNet QT database. A model based on a denoising convolutional neural network (DnCNN) was trained on this synthetic dataset and used to convert the synthetically generated images back into time-series data for evaluation. The signal-to-noise ratio (SNR) was calculated to assess the quality of image digitization against the ground-truth ECG time series. The results show an average signal recovery SNR of 11.17 ± 9.19 dB, indicating that the synthetic ECG image dataset is useful for training deep learning models. For clinical evaluation, we measured the error between the RR and QT intervals of the estimated and ground-truth time series. The accuracy of the estimated RR and QT intervals suggests that the respective clinical parameters are preserved. These results demonstrate the effectiveness of a deep learning-based pipeline in accurately digitizing paper ECGs and highlight a generative approach to digitization.
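The signal recovery SNR mentioned above compares a digitized ECG trace with its ground-truth time series. The abstract does not state the exact formula, so the following is only a minimal sketch assuming the common definition, the ratio of reference signal power to reconstruction error power expressed in dB. The function name `snr_db` and the toy sinusoidal signals are illustrative assumptions and are not part of ECG-Image-Kit.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB of `estimate` relative to `reference`.

    Assumed definition (not confirmed by the source):
    10 * log10(sum(x^2) / sum((x - x_hat)^2)).
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)


# Toy stand-ins for a ground-truth trace and its digitized estimate:
# a 5 Hz sinusoid plus a small 50 Hz error term, sampled at 500 Hz for 1 s.
n_samples = 500
reference = [math.sin(2 * math.pi * 5 * n / n_samples) for n in range(n_samples)]
estimate = [
    x + 0.1 * math.cos(2 * math.pi * 50 * n / n_samples)
    for n, x in enumerate(reference)
]

snr = snr_db(reference, estimate)
print(f"SNR: {snr:.2f} dB")
```

In this toy case the error amplitude is one tenth of the signal amplitude, so the power ratio is 100 and the SNR is about 20 dB. A per-record value computed this way could then be averaged across a dataset to obtain a mean and standard deviation of the kind reported in the abstract.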