Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status from vocal audio signals, such as cough recordings. However, existing studies have limitations in both data collection and the assessment of the performance of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques for predicting COVID-19 infection status from vocal audio signals, using a dataset collected by the UK Health Security Agency. The dataset includes acoustic recordings and extensive study participant metadata. We provide guidelines on testing the performance of methods that classify COVID-19 infection status based on acoustic features, and we discuss how these guidelines can be extended more generally to the development and assessment of predictive methods based on public health datasets.