Modelling excess zeros in count data: A new perspective on modelling approaches

We consider the analysis of count data in which the observed frequency of zero counts is unusually large, typically relative to the Poisson distribution. We focus on two alternative modelling approaches: Over-Dispersion (OD) models and Zero-Inflation (ZI) models, both of which can be seen as generalisations of the Poisson distribution; we refer to these as Implicit and Explicit ZI models, respectively. Although sometimes seen as competing approaches, they can be complementary: OD is a consequence of ZI modelling, and ZI is a by-product of OD modelling. The central objective in such analyses is often inference on the effect of covariates on the mean, in light of the apparent excess of zeros in the counts. Typically, modelling the excess zeros per se is a secondary objective, and there are choices to be made between, and within, the OD and ZI approaches. The contribution of this paper is primarily conceptual. We contrast, descriptively, the impact of the two approaches on zeros. We further offer a novel descriptive characterisation of alternative ZI models, including the classic hurdle and mixture models, by providing a unifying theoretical framework for their comparison. This in turn leads to a novel and technically simpler ZI model. We develop the underlying theory for univariate counts and touch on its implications for multivariate count data.
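The claim that OD follows from ZI modelling, and ZI from OD modelling, can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the textbook zero-inflated Poisson mixture as the explicit ZI model and the negative binomial as the OD (implicit ZI) model, which may not be the exact models the paper studies. The function names and the parameter values (pi = 0.3, lam = 2.0, r = 1.5) are hypothetical, chosen for illustration.

```python
import math


def poisson_zero_prob(mu):
    """P(Y = 0) for a Poisson count with mean mu."""
    return math.exp(-mu)


def zip_summary(pi, lam):
    """Mean, variance and P(Y = 0) of a zero-inflated Poisson mixture.

    Assumed form: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam).
    """
    mean = (1 - pi) * lam
    var = (1 - pi) * lam * (1 + pi * lam)
    p0 = pi + (1 - pi) * math.exp(-lam)
    return mean, var, p0


def negbin_summary(mu, r):
    """Mean, variance and P(Y = 0) of a negative binomial count.

    Assumed parameterisation: mean mu and size (dispersion) r,
    so that the variance is mu + mu**2 / r.
    """
    var = mu + mu ** 2 / r
    p0 = (r / (r + mu)) ** r
    return mu, var, p0


# Illustrative parameter values only (not taken from the paper).
zip_mean, zip_var, zip_p0 = zip_summary(pi=0.3, lam=2.0)
nb_mean, nb_var, nb_p0 = negbin_summary(mu=zip_mean, r=1.5)
pois_p0 = poisson_zero_prob(zip_mean)

print(f"Poisson  mean={zip_mean:.3f} var={zip_mean:.3f} P(0)={pois_p0:.3f}")
print(f"ZIP      mean={zip_mean:.3f} var={zip_var:.3f} P(0)={zip_p0:.3f}")
print(f"NegBin   mean={nb_mean:.3f} var={nb_var:.3f} P(0)={nb_p0:.3f}")
```

With all three distributions matched on the mean, the mixture has variance above its mean (OD as a consequence of ZI), and the negative binomial puts more mass at zero than the Poisson (ZI as a by-product of OD).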

