As coronavirus disease 2019 (COVID-19) continues to be a global pandemic, policy makers have enacted and reversed non-pharmaceutical interventions with varying levels of restriction to limit its spread. Data-driven approaches that analyze the temporal characteristics of the pandemic and its dependence on regional conditions may supply information to support the implementation of mitigation and suppression strategies. To facilitate research in this direction, using the United States as an example, we present a machine-readable dataset that aggregates relevant data from governmental, journalistic, and academic sources at the U.S. county level. In addition to county-level time-series data from the JHU CSSE COVID-19 Dashboard, our dataset contains more than 300 variables that summarize population estimates, demographics, ethnicity, housing, education, employment and income, climate, transit scores, and healthcare system-related metrics. Furthermore, we present aggregated out-of-home activity information for various points of interest in each county, including grocery stores and hospitals, summarizing data from SafeGraph and Google mobility reports. We compile the dates on which non-pharmaceutical interventions were enacted and reversed from IHME, state and county-level governments, and newspapers. By collecting these data and providing tools to read them, we hope to accelerate research into how the disease spreads and why its spread may differ across regions. Our dataset and associated code are available at github.com/JieYingWu/COVID-19_US_County-level_Summaries.