This paper presents a novel low-cost method to predict i) the vascular age of a healthy young person and ii) whether or not a person is a smoker, using only lead I of the electrocardiogram (ECG). We begin by collecting lead-I ECG data from 42 healthy subjects (male, female, smoker, non-smoker) aged 18 to 30 years using our custom-built, low-cost, single-lead ECG module, along with anthropometric and related data, e.g., body mass index, smoking status, and blood pressure. Under our proposed method, we first pre-process the dataset by denoising the ECG traces, removing baseline drift, and applying z-score normalization. Next, we divide the ECG traces into overlapping segments of five-second duration, which leads to a 145-fold increase in the size of the dataset. We then feed the dataset to a number of machine learning models, including a 1D convolutional neural network, a multi-layer perceptron (MLP), and a ResNet18 transfer learning model. For the vascular age prediction problem, the Random Forest method outperforms all other methods, with an R² score of 0.99 and a mean squared error of 0.07. For the binary classification problem, which aims to differentiate between a smoker and a non-smoker, the XGBoost method stands out with an accuracy of 96.5%. Finally, for the 4-class classification problem, which aims to differentiate between male smoker, female smoker, male non-smoker, and female non-smoker, the MLP method achieves the best accuracy of 97.5%. This work is aligned with the sustainable development goals of the United Nations, which aim to provide low-cost but quality healthcare solutions to the underprivileged population.
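To make the normalization and segmentation steps described above concrete, the following is a minimal sketch, not the authors' implementation. The function names `zscore` and `segment`, the 250 Hz sampling rate, and the one-second stride between windows are all assumptions chosen for illustration; the abstract specifies only the five-second window length and that the windows overlap. Denoising and baseline drift removal are omitted because the abstract does not say which filters are used.

```python
import numpy as np


def zscore(x):
    """Return x shifted to zero mean and scaled to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        # A flat trace carries no information; avoid dividing by zero.
        return np.zeros_like(x)
    return (x - x.mean()) / std


def segment(x, fs, window_s=5.0, stride_s=1.0):
    """Cut a 1D trace into overlapping windows.

    fs       : sampling rate in Hz (assumed value, not from the paper)
    window_s : window length in seconds (five seconds, as in the abstract)
    stride_s : shift between consecutive windows in seconds (assumed)
    """
    x = np.asarray(x, dtype=float)
    win = int(round(window_s * fs))
    step = int(round(stride_s * fs))
    if win <= 0 or step <= 0:
        raise ValueError("window and stride must be positive")
    if len(x) < win:
        # Trace is shorter than one window: no segments can be produced.
        return np.empty((0, win))
    starts = range(0, len(x) - win + 1, step)
    return np.stack([x[s:s + win] for s in starts])


# Synthetic stand-in for a 30-second ECG trace sampled at 250 Hz.
trace = np.sin(np.linspace(0, 60 * np.pi, 7500))
segments = segment(zscore(trace), fs=250)
print(segments.shape)
```

With these assumed settings, one 30-second trace yields 26 windows of 1250 samples each. A smaller stride produces more, and more strongly overlapping, segments, which is how overlapping windows can multiply the size of a dataset.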