The COVID-19 pandemic has challenged scientists and policy-makers worldwide to develop novel approaches to public health policy. The prevalence and spread of COVID-19 have also been observed to vary across space, time, and demographic groups. Although testing has been ramped up, it remains below the required level in most parts of the world. We therefore use self-reported symptom survey data to understand trends in the spread of COVID-19. The aim of this study is to identify population segments that are highly susceptible. To understand these populations, we perform exploratory data analysis, outbreak prediction, and time-series forecasting using public health and policy datasets. We predict the likely percentage of the population that tested positive for COVID-19 from self-reported symptoms. Our findings reaffirm the predictive value of symptoms such as anosmia and ageusia. We forecast the percentage of the population with COVID-19-like illness (CLI) and the percentage that tested positive with absolute errors of 0.15% and 1.14%, respectively. These findings could help accelerate the development of public health policy, particularly in areas with low levels of testing and a greater reliance on self-reported symptoms. Our analysis helps identify clinical attributes of interest across demographic groups. We also provide insights into the effects of various policy enactments on COVID-19 prevalence.
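As an illustration of how a forecast error of this kind can be read, the sketch below computes a mean absolute error between observed and forecast percentages. It assumes the reported figures are mean absolute errors expressed in percentage points, which the abstract does not state explicitly. The function name `mean_absolute_error` and all data values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical daily percentages of a population reporting CLI.
# These numbers are made up for illustration only.
observed_cli_pct = [1.0, 1.2, 1.1, 0.9]
forecast_cli_pct = [1.1, 1.0, 1.2, 1.0]

# Because both inputs are percentages, the error is in percentage points.
mae = mean_absolute_error(observed_cli_pct, forecast_cli_pct)
print(f"MAE: {mae:.3f} percentage points")
```

Under this reading, an error of 0.15% would mean the forecast percentage differs from the observed percentage by 0.15 percentage points on average, not by 0.15% of the observed value.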