A spatiotemporal machine learning approach to forecasting COVID-19 incidence at the county level in the United States

With COVID-19 affecting every country and changing everyday life, the ability to forecast the spread of the disease is more important than in any previous epidemic. Compartmental models, the conventional approach to disease-spread modeling, assume that the virus spreads homogeneously in space and time, which may cause forecasts to underperform, especially at high spatial resolutions. In this paper we approach the forecasting task with an alternative technique: spatiotemporal machine learning. We present COVID-LSTM, a data-driven model based on a Long Short-Term Memory (LSTM) deep learning architecture for forecasting COVID-19 incidence at the county level in the US. We use the weekly number of new positive cases as temporal input, and hand-engineered spatial features from Facebook movement and connectedness datasets, to capture the spread of the disease in time and space. COVID-LSTM outperforms the COVID-19 Forecast Hub's ensemble model (COVIDhub-ensemble) over our 17-week evaluation period, making it the first model to be more accurate than the COVIDhub-ensemble over one or more forecast periods. Over the 4-week forecast horizon, our model is on average 50 cases per county more accurate than the COVIDhub-ensemble. We highlight that the underutilization of data-driven forecasting of disease spread before COVID-19 is likely due to the lack of sufficient data for previous diseases and to the fact that machine learning methods for spatiotemporal forecasting have advanced only recently. We discuss the impediments to the wider uptake of data-driven forecasting, and whether more deep learning-based models are likely to be used in the future.
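To make the data layout concrete, the following is a minimal sketch of how weekly county case counts and spatial features might be arranged into sequence inputs with a 4-week target, and how two sets of forecasts could be compared per county. It is not the COVID-LSTM implementation. The function names (`make_windows`, `mean_absolute_error`), the lookback length, the static spatial feature values, and the choice of mean absolute error as the metric are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Sample = Tuple[List[List[float]], List[float]]


def make_windows(weekly_cases: List[float], spatial: List[float],
                 lookback: int, horizon: int) -> List[Sample]:
    """Slice one county's weekly series into (input, target) pairs.

    Each input has `lookback` time steps; every step holds that week's
    case count followed by the county's spatial features (treated as
    static here, which is an assumption). The target is the next
    `horizon` weekly counts.
    """
    samples: List[Sample] = []
    for start in range(len(weekly_cases) - lookback - horizon + 1):
        past = weekly_cases[start:start + lookback]
        future = weekly_cases[start + lookback:start + lookback + horizon]
        x = [[cases] + list(spatial) for cases in past]
        samples.append((x, future))
    return samples


def mean_absolute_error(forecasts: Dict[str, List[float]],
                        observed: Dict[str, List[float]]) -> float:
    """Average absolute error in cases over all counties and forecast weeks."""
    errors = [abs(f - o)
              for county in observed
              for f, o in zip(forecasts[county], observed[county])]
    return sum(errors) / len(errors)


# Toy data: eight weeks of cases for one hypothetical county, with two
# made-up spatial features (e.g. a movement and a connectedness score).
cases = [10, 12, 15, 20, 18, 25, 30, 28]
samples = make_windows(cases, spatial=[0.2, 0.7], lookback=3, horizon=4)

# Toy comparison of two forecasters on two hypothetical counties.
observed = {"A": [20, 18, 25, 30], "B": [5, 5, 5, 5]}
model_a = {"A": [22, 18, 24, 30], "B": [5, 6, 5, 5]}
model_b = {"A": [25, 20, 20, 30], "B": [7, 5, 5, 3]}
gap = mean_absolute_error(model_b, observed) - mean_absolute_error(model_a, observed)
print(f"{len(samples)} samples; model_a is {gap:.1f} cases more accurate on average")
```

A sequence model such as an LSTM would consume each `x` as a (time steps, features) array. The error gap computed at the end is one plausible way to read a statement like "50 cases per county more accurate", under the assumed metric.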
