In this study, we propose a machine learning-based system that distinguishes patients with COVID-19 from non-COVID-19 patients by analyzing a single cough sound. Two data sets were used, one publicly accessible and the other available on request. After the data sets were combined, features were extracted from the cough sounds using the mel-frequency cepstral coefficients (MFCCs) method and then classified with seven different machine learning classifiers. The leave-one-out cross-validation (LOO-CV) strategy was used to determine the optimum hyperparameter values for the MFCCs and the classifiers. The k-nearest neighbors classifier based on the Euclidean distance (k-NN Euclidean) outperformed the other classifiers, with an accuracy, sensitivity for COVID-19, sensitivity for non-COVID-19, F-measure, and area under the ROC curve (AUC) of 0.9833, 1.0000, 0.9720, 0.9799, and 0.9860, respectively. Finally, the most effective features were determined for each classifier using the sequential forward selection (SFS) method. According to the results, the proposed system compares favorably with similar studies in the literature and could readily be used on smartphones to facilitate the diagnosis of COVID-19 patients. In addition, because the data set includes reflex and unconscious coughs, the results indicate that whether a cough is conscious or unconscious has no effect on the diagnosis of COVID-19 patients based on the cough sound.
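The evaluation loop described above (a Euclidean k-NN classifier scored with leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the value of k, the 13-dimensional feature vectors, and the synthetic data standing in for MFCC features are all assumptions made for illustration, and MFCC extraction and SFS are omitted.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loo_cv(X, y, k=3):
    """Leave-one-out CV: each sample is predicted from all the others."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(knn_predict(train_X, train_y, X[i], k))
    return preds


def report(y_true, y_pred, positive=1):
    """Accuracy plus per-class sensitivity (recall) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity_positive": tp / (tp + fn) if tp + fn else float("nan"),
        "sensitivity_negative": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Synthetic stand-in for per-cough MFCC feature vectors (assumption:
# 13 coefficients per cough; label 1 = COVID-19, label 0 = non-COVID-19).
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
X += [[random.gauss(3.0, 1.0) for _ in range(13)] for _ in range(20)]
y = [0] * 20 + [1] * 20

preds = loo_cv(X, y, k=3)
metrics = report(y, preds)
print(metrics)
```

In a hyperparameter search of the kind the abstract mentions, `loo_cv` would be repeated over candidate values of `k` (and over MFCC settings) and the configuration with the best score kept. The numbers printed here come from synthetic data and say nothing about the reported results.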