By the end of April 20, 2020, only a few new COVID-19 cases remained in China, whereas the rest of the world was still seeing increases in the number of new cases. An efficient statistical model of COVID-19 spread is therefore of great importance to the global fight against the virus. We propose a clustering-segmented autoregressive sigmoid (CSAS) model to explore the space-time pattern of the log-transformed infectious count. The CSAS model has four key components: unknown clusters, change points, stretched S-curves, and autoregressive terms. Together they let us understand how the outbreak spreads in time and in space, understand how the spread is affected by epidemic control strategies, and apply the model to updated data from an extended period of time. We propose a nonparametric graph-based clustering method for discovering dissimilarity of the curve time series in space, and we justify it theoretically under mild and easily verified conditions. We also propose a very strict purity score that penalizes overestimation of the number of clusters. Simulations show that our nonparametric graph-based clustering method is faster and more accurate than the parametric clustering method regardless of the size of the data sets. We use a Bayesian information criterion (BIC) to identify multiple change points, and we calculate a confidence interval for the mean response. By applying the CSAS model to the collected data, we can explain the differences between prevention and control policies in China and selected countries.
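To make the idea of choosing multiple change points with a BIC concrete, here is a minimal Python sketch. It is not the CSAS model: as a simplifying assumption, each segment is fitted with an ordinary straight line rather than a stretched sigmoid with autoregressive terms, and the BIC formula, the parameter count, and the names `segment_rss`, `best_segmentation`, `select_change_points`, `max_cp`, and `min_len` are all hypothetical choices made for illustration.

```python
import numpy as np


def segment_rss(t, y):
    """Residual sum of squares of a least-squares straight line on one segment."""
    X = np.column_stack([np.ones(len(t)), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def best_segmentation(t, y, n_cp, min_len=5):
    """Place n_cp change points to minimize total RSS (dynamic programming)."""
    n = len(y)
    # cost[i, j] = RSS of a line fitted to y[i:j]; infinite if the segment is too short
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            cost[i, j] = segment_rss(t[i:j], y[i:j])
    # dp[k, j] = smallest RSS for y[:j] split into k + 1 segments
    dp = np.full((n_cp + 1, n + 1), np.inf)
    back = np.zeros((n_cp + 1, n + 1), dtype=int)
    dp[0] = cost[0]
    for k in range(1, n_cp + 1):
        for j in range(1, n + 1):
            cand = dp[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(cand))
            dp[k, j] = cand[i]
            back[k, j] = i
    # Walk back from the end to recover the segment start indices
    cps, j = [], n
    for k in range(n_cp, 0, -1):
        j = back[k, j]
        cps.append(j)
    return dp[n_cp, n], sorted(cps)


def select_change_points(t, y, max_cp=3, min_len=5):
    """Return (bic, change_points) for the number of change points with the lowest BIC."""
    n = len(y)
    best = None
    for m in range(max_cp + 1):
        rss, cps = best_segmentation(t, y, m, min_len)
        if not np.isfinite(rss):
            continue
        # Assumed parameter count: slope and intercept per segment, plus one per change point
        n_params = 2 * (m + 1) + m
        bic = n * np.log(max(rss, 1e-12) / n) + n_params * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, cps)
    return best


# Usage on synthetic "log count" data: fast growth, then a flatter curve after day 30
rng = np.random.default_rng(0)
t = np.arange(60, dtype=float)
y = np.where(t < 30, 0.2 * t, 6.0 + 0.02 * (t - 30)) + rng.normal(0.0, 0.05, size=60)
bic, change_points = select_change_points(t, y)
print(change_points)
```

The penalty term grows with each added change point, so an extra segment is accepted only when it reduces the residual sum of squares enough to pay for its parameters.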