Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digital stethoscopes) for cardiovascular or respiratory examination, which can then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper, we describe our analysis of a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to understand how distinguishable COVID-19 sounds are from those of people with asthma and of healthy controls. Our results show that even a simple binary machine learning classifier can correctly classify healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and of audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis.
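The abstract reports performance as AUC (area under the ROC curve) for binary tasks such as COVID-19 versus healthy. As a minimal sketch of what that metric measures, the snippet below computes AUC from its standard pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical illustrations. They are not the authors' code or data, and the abstract does not say how its AUC values were computed.

```python
def roc_auc(labels, scores):
    """AUC via the pairwise definition: the fraction of (positive, negative)
    pairs in which the positive example gets the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs (illustrative only, not from the paper):
# label 1 = COVID-19 positive, label 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so "above 80%" means that in more than four out of five positive/negative pairs the model ranks the positive example higher.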