Audio classification of breath and cough samples has recently emerged as a low-cost, non-invasive, and accessible COVID-19 screening method. However, at the time of writing, no application has been approved for official use, owing to the stringent reliability and accuracy requirements of critical healthcare settings. To support the development of machine learning classification models, we performed an extensive comparative investigation and ranking of 15 audio features, including less well-known ones. The results were verified on two independent COVID-19 sound datasets. Using the identified top-performing features, we increased COVID-19 classification accuracy by up to 17% on the Cambridge dataset and by up to 10% on the Coswara dataset, compared with the original baseline accuracy obtained without our feature ranking.