OBJECTIVE: Our objective is to evaluate whether cough audio recordings (spontaneous or simulated) can be used to detect sound patterns in people diagnosed with COVID-19. The research question that guided our work was: what are the sensitivity and specificity of a machine-learning-based COVID-19 cough classifier, using RT-PCR tests as the gold standard? SETTING: The audio samples collected for this study belong to individuals who were swabbed in the City of Buenos Aires at 20 public facilities and 1 private facility where RT-PCR tests were performed on patients with suspected COVID-19, and at 14 out-of-hospital isolation units for patients with confirmed mild COVID-19. The recordings were collected through the Buenos Aires city government WhatsApp chatbot, which was specifically designed to address citizen inquiries related to the COVID-19 pandemic. PARTICIPANTS: The data correspond to 2821 individuals who were swabbed in the City of Buenos Aires between August 11 and December 2, 2020. Of these, 1409 tested positive for COVID-19 and 1412 tested negative. In this sample, 52.6% of the individuals were female and 47.4% were male; 2.5% were between 0 and 20 years of age, 61.1% between 21 and 40, 30.3% between 41 and 60, and 6.1% were over 61. RESULTS: Using the dataset of 2821 individuals, the neural network classifier discriminated between COVID-19-positive and healthy coughs with an accuracy of 86%. This accuracy, obtained during the training process, was later tested and confirmed with a second dataset of 492 individuals.
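For readers less familiar with the metrics named in the research question, the following is a minimal sketch of how sensitivity, specificity, and accuracy are computed from a confusion matrix when RT-PCR is taken as the reference standard. The class name, function names, and the counts in `example` are hypothetical and chosen only for illustration; they are not results from this study, which reports accuracy only.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Classifier output compared against the RT-PCR result (reference standard)."""
    tp: int  # RT-PCR positive, classified positive
    fn: int  # RT-PCR positive, classified negative
    tn: int  # RT-PCR negative, classified negative
    fp: int  # RT-PCR negative, classified positive


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of RT-PCR-positive individuals the classifier flags as positive.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of RT-PCR-negative individuals the classifier flags as negative.
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all individuals classified in agreement with RT-PCR.
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for illustration only (NOT data from the study).
example = ConfusionCounts(tp=80, fn=20, tn=90, fp=10)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.2f}")
    print(f"specificity = {specificity(example):.2f}")
    print(f"accuracy    = {accuracy(example):.2f}")
```

With nearly balanced classes, as in the 1409 positive and 1412 negative individuals described above, accuracy is close to the mean of sensitivity and specificity, so a single accuracy figure does not determine either one on its own.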