More than two years after its outbreak, the COVID-19 pandemic continues to plague medical systems around the world, straining scarce resources and claiming human lives. From the very beginning, various AI-based COVID-19 detection and monitoring tools have been pursued in an attempt to stem the tide of infections through timely diagnosis. In particular, computer audition has been suggested as a non-invasive, cost-efficient, and eco-friendly alternative for detecting COVID-19 infections through vocal sounds. However, like all AI methods, computer audition is heavily dependent on the quantity and quality of available data, and large-scale COVID-19 sound datasets are difficult to acquire, amongst other reasons, due to the sensitive nature of such data. To that end, we introduce the COVYT dataset -- a novel COVID-19 dataset collected from public sources containing more than 8 hours of speech from 65 speakers. Compared to other existing COVID-19 sound datasets, the unique feature of the COVYT dataset is that it comprises both COVID-19 positive and negative samples from all 65 speakers. We analyse the acoustic manifestation of COVID-19 on the basis of these 'in-the-wild' data, which are perfectly balanced with respect to speaker characteristics, using interpretable audio descriptors, and investigate several classification scenarios that shed light on proper partitioning strategies for fair speech-based COVID-19 detection.