Identifying the socio-economic sectors vulnerable to a disease outbreak in a timely manner is an important challenge for civic authorities and healthcare workers planning outbreak mitigation measures. Traditionally, this problem has been addressed by studying aberrations in small-scale healthcare data. In this paper, we use data-driven models to determine the relationship between trends in the World Development Indicators and the occurrence of disease outbreaks, using worldwide historical data from 2000 to 2019, and we treat the task as a classic supervised classification problem. CART-based feature selection is employed in an unorthodox fashion to determine which covariates are affected by a disease outbreak, thereby identifying the most vulnerable sectors. The results comprise a comprehensive analysis of different classification algorithms and indicate a relationship between the occurrence of disease outbreaks and the magnitudes of various development indicators.
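The idea behind CART-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify their procedure, so the example below only scores each covariate by the largest Gini impurity decrease that a single CART-style threshold split on it achieves, which is the criterion CART applies at each node. The data are synthetic, and the names `indicator_a`, `indicator_b`, `indicator_c`, and `rank_indicators` are hypothetical placeholders, not actual World Development Indicators or functions from the paper.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_gini_decrease(values, labels):
    """Largest impurity decrease over all threshold splits on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini([lab for _, lab in pairs])
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_indicators(columns, labels):
    """Return (name, score) pairs sorted by split quality, best first."""
    scores = [(name, best_gini_decrease(vals, labels))
              for name, vals in columns.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Synthetic stand-in data (assumption): 200 country-year rows, where the
# label marks whether an outbreak occurred and only indicator_b shifts
# with the label.
random.seed(0)
n_rows = 200
labels = [random.randint(0, 1) for _ in range(n_rows)]
columns = {
    "indicator_a": [random.gauss(0.0, 1.0) for _ in labels],
    "indicator_b": [random.gauss(2.0 * y, 1.0) for y in labels],
    "indicator_c": [random.gauss(0.0, 1.0) for _ in labels],
}

ranking = rank_indicators(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the covariate constructed to move with the outbreak label ranks first. A full CART model aggregates such impurity decreases over every node of the tree; in practice one would typically rely on a library implementation, for example the `feature_importances_` attribute of scikit-learn's `DecisionTreeClassifier`, rather than the single-split score used here.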