This paper considers the problem of selecting partial correlation and causality graphs for count data. A parameter-driven generalized linear model is used to describe the observed multivariate time series of counts. The partial correlation and causality graphs corresponding to this model explain the dependencies among the component time series of the multivariate count data. To estimate these graphs with tunable sparsity, maximization of an appropriate likelihood function is regularized with an l1-type constraint. A novel Monte Carlo expectation-maximization (MCEM) algorithm is proposed to solve this regularized maximum likelihood estimation (MLE) problem iteratively. Asymptotic convergence results are proved for the sequence generated by the proposed MCEM algorithm with l1-type regularization. The algorithm is first tested successfully on simulated data. It is then applied to observed weekly dengue disease counts from each ward of Greater Mumbai city. The interdependence of the wards in the proliferation of the disease is characterized by the edges of the inferred partial correlation graph. The relative roles of the wards as sources and sinks of dengue spread, in turn, are quantified by the number and weights of the directed edges originating from and incident upon each ward. The estimated graphs show that some wards act as epicentres of dengue spread even though their disease counts are relatively low.
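To make the phrase "tunable sparsity" concrete, the following is a minimal sketch of how an l1-type penalty removes weak edges from a graph. It is not the MCEM algorithm described above: it only applies the elementwise soft-thresholding operator (the proximal operator of the l1 norm) to a hypothetical matrix of edge weights. The matrix `raw`, the penalty values, and the helper names `soft_threshold` and `count_edges` are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1, applied elementwise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def count_edges(weights):
    """Count nonzero off-diagonal entries, each read as one directed edge."""
    off_diag = weights - np.diag(np.diag(weights))
    return int(np.count_nonzero(off_diag))

# Hypothetical 3x3 matrix of unpenalized edge weights (illustrative values only).
raw = np.array([[1.00, 0.40, -0.05],
                [0.40, 1.00, 0.15],
                [-0.05, 0.15, 1.00]])

# A larger penalty shrinks more weights exactly to zero, giving a sparser graph.
for lam in (0.0, 0.1, 0.5):
    shrunk = soft_threshold(raw, lam)
    print(f"lambda={lam}: {count_edges(shrunk)} nonzero off-diagonal entries")
```

In this toy setting, raising the penalty from 0 to 0.5 takes the graph from fully connected to empty, which is the sense in which the regularization weight controls sparsity. In the paper's setting the shrinkage would act inside the regularized likelihood maximization rather than on a fixed matrix.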