Coronavirus disease 2019 (COVID-19) has posed a severe threat to global human health and the economy. Building reliable data-driven prediction models for COVID-19 cases is an urgent task for improving public policy making. However, COVID-19 data show special transmission characteristics, such as significant fluctuations and non-stationarity, which may be difficult for a single predictive model to capture and which pose grand challenges for effective forecasting. In this paper, we propose a novel Hybrid data-driven model that combines an autoregressive (AR) model with a long short-term memory (LSTM) neural network. It can be viewed as a new neural network model in which the contributions of AR and LSTM are tuned automatically during training. We conduct extensive numerical experiments on data collected from 8 California counties that display various trends. The numerical results show the Hybrid model's advantage over AR and LSTM in predictive power: on average, the Hybrid model achieves a mean absolute percentage error (MAPE) of 4.195\%, outperforming AR (5.629\%) and LSTM (5.070\%). We also provide a discussion of interpretability.
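For readers unfamiliar with the reported metric, the sketch below shows how a MAPE value is conventionally computed. It uses the standard textbook definition; the abstract does not state the exact variant used in the experiments, so the function name `mape`, the handling of zero counts, and the sample numbers are illustrative assumptions rather than the authors' code or data.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (standard definition).

    Assumption: every actual value is non-zero; the function raises
    ValueError otherwise, since the ratio is undefined there.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the per-point relative errors, then scale to percent.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up daily case counts, purely for illustration.
observed = [100.0, 200.0]
forecast = [110.0, 190.0]
print(f"MAPE = {mape(observed, forecast):.3f}%")  # roughly 7.5%
```

A lower value indicates a closer fit, which is the sense in which the reported 4.195\% compares favorably with 5.629\% and 5.070\%.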