A Simulation Study of the Performance of Statistical Models for Count Outcomes with Excessive Zeros

Background: Outcome measures that are count variables with excessive zeros are common in health behaviors research. There is a lack of empirical data about the relative performance of prevailing statistical models when outcomes are zero-inflated, particularly compared with recently developed approaches. Methods: The current simulation study examined five commonly used analytical approaches for count outcomes, including two linear models (with outcomes on the raw and log-transformed scales, respectively) and three count distribution-based models (i.e., Poisson, negative binomial, and zero-inflated Poisson (ZIP) models). We also considered the marginalized zero-inflated Poisson (MZIP) model, a novel alternative that estimates effects on the overall mean while adjusting for zero-inflation. Extensive simulations were conducted to evaluate the statistical power and Type I error rate of each model across various data conditions. Results: Under zero-inflation, the Poisson model failed to control the Type I error rate, producing more false positive results than expected. When the intervention effects on the zero (vs. non-zero) part and the count part were in the same direction, the MZIP model had the highest statistical power, followed by the linear model with outcomes on the raw scale, the negative binomial model, and the ZIP model. The performance of the linear model with a log-transformed outcome variable was unsatisfactory. When an effect existed on only one of the two parts (the zero (vs. non-zero) part or the count part), the ZIP model had the highest statistical power. Conclusions: The MZIP model demonstrated better statistical properties in detecting true intervention effects and controlling false positive results for zero-inflated count outcomes. The MZIP model may serve as an appealing analytical approach to evaluating overall intervention effects in studies with count outcomes marked by excessive zeros.
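The Type I error finding for the Poisson model can be illustrated with a minimal sketch. This is not the study's simulation code: the function names, the two-arm design with no covariates, and the parameter values (`psi`, `mu`, `n`, `reps`) are all assumptions chosen for illustration. The sketch draws zero-inflated Poisson counts for two arms with no true difference and applies the model-based Wald test from a Poisson regression with a single binary covariate, for which the estimated log rate ratio is the log of the ratio of arm means and its standard error is the square root of the sum of the reciprocals of the arm totals.

```python
import numpy as np

def simulate_zip(rng, n, psi, mu):
    """Draw n zero-inflated Poisson counts.

    Each observation is a structural zero with probability psi,
    otherwise a Poisson(mu) draw. The overall (marginal) mean is
    (1 - psi) * mu, the quantity an MZIP model targets.
    """
    structural_zero = rng.random(n) < psi
    counts = rng.poisson(mu, size=n)
    return np.where(structural_zero, 0, counts)

def poisson_wald_rejects(y0, y1, z_crit=1.96):
    """Two-sided Wald test of the log rate ratio under a Poisson model.

    Uses the closed-form fit for one binary covariate:
    log_rr = log(mean(y1) / mean(y0)), se = sqrt(1/sum(y1) + 1/sum(y0)).
    """
    s0, s1 = y0.sum(), y1.sum()
    if s0 == 0 or s1 == 0:
        return False  # log rate ratio undefined; count as no rejection
    log_rr = np.log(y1.mean() / y0.mean())
    se = np.sqrt(1.0 / s0 + 1.0 / s1)
    return bool(abs(log_rr / se) > z_crit)

def type1_error(psi, mu=3.0, n=100, reps=2000, seed=0):
    """Empirical rejection rate when both arms share the same distribution.

    All default values are illustrative assumptions, not the study's settings.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        y0 = simulate_zip(rng, n, psi, mu)
        y1 = simulate_zip(rng, n, psi, mu)
        rejections += poisson_wald_rejects(y0, y1)
    return rejections / reps

if __name__ == "__main__":
    print("no zero-inflation :", type1_error(psi=0.0))
    print("50% excess zeros  :", type1_error(psi=0.5))
```

Without zero-inflation the rejection rate should sit near the nominal 0.05. With excess zeros the variance of the counts exceeds their mean, so the Poisson standard error is too small and the rejection rate rises well above the nominal level.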
