In this paper, we apply statistical methods for functional data to explain the heterogeneity in the evolution of the number of Covid-19 deaths across different regions. We treat the cumulative daily number of deaths in a specific region as a curve (functional data), so that the data comprise a set of curves over a cross-section of locations. We start by using clustering methods for functional data to identify potential heterogeneity in the curves and their functional derivatives. This first stage is an unconditional descriptive analysis, as we do not use any covariates to estimate the clusters. The estimated clusters are interpreted as "levels of alert" to identify cities in a possibly critical situation. In the second and final stage, we propose a functional quantile regression model of the death curves on a number of scalar socioeconomic and demographic indicators in order to investigate their functional effects at different levels of the cumulative number of deaths over time. The proposed model showed superior predictive capacity, providing a better curve fit at different levels of the cumulative number of deaths than the functional regression model based on ordinary least squares.
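As a rough illustration of the first stage only, the sketch below clusters discretized cumulative death curves and their finite-difference derivatives. It is not the paper's procedure: plain k-means on the sampled curves stands in for the functional clustering method, which the abstract does not specify, the curves are synthetic logistic shapes rather than real data, and the function names `finite_difference_derivative` and `kmeans_curves` are hypothetical.

```python
import numpy as np

def finite_difference_derivative(curves, dt=1.0):
    """Approximate the derivative of each curve (rows) along the time axis."""
    return np.gradient(curves, dt, axis=1)

def kmeans_curves(curves, k, n_iter=50, seed=0):
    """Plain k-means on discretized curves (assumed stand-in for functional clustering)."""
    rng = np.random.default_rng(seed)
    centers = curves[rng.choice(len(curves), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared L2 distance between every curve and every cluster center.
        d2 = ((curves[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster becomes empty.
        new_centers = np.array([
            curves[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic data (assumption): 5 slow-growing and 5 fast-growing regions over 60 days.
t = np.arange(60)
slow = [p / (1 + np.exp(-(t - 30) / 5)) for p in (90, 95, 100, 105, 110)]
fast = [p / (1 + np.exp(-(t - 20) / 4)) for p in (900, 950, 1000, 1050, 1100)]
curves = np.array(slow + fast)

rates = finite_difference_derivative(curves)      # estimated daily death rate
labels_level, _ = kmeans_curves(curves, k=2)      # clusters of cumulative curves
labels_rate, _ = kmeans_curves(rates, k=2)        # clusters of derivative curves
```

Clustering the derivative curves separately mirrors the abstract's point that heterogeneity may show up in the rate of growth as well as in the level. With real data, the resulting groups could be read as candidate "levels of alert".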