In epidemiological studies, zero-inflated and hurdle models are commonly used to handle excess zeros in reported infectious disease cases. However, they cannot model the persistence (from presence to presence) and reemergence (from absence to presence) of a disease separately, even though covariates can sometimes have different effects on these two processes. Recently, a zero-inflated Markov switching negative binomial model was proposed to address this issue. We present a Markov switching negative binomial hurdle model as a competitor to that approach, since hurdle models are often used as alternatives to zero-inflated models for accommodating excess zeros. We begin the comparison by inspecting the underlying assumptions of both models. Hurdle models assume perfect detection of disease cases, while zero-inflated models implicitly assume that case counts can be under-reported; we therefore investigate when a negative binomial distribution can approximate the true distribution of reported counts. We then compare the fit of the two types of Markov switching models on chikungunya cases across the neighborhoods of Rio de Janeiro. We find that, among the fitted models, the Markov switching negative binomial zero-inflated model produces the best predictions, and that both Markov switching models produce remarkably better predictions than more traditional negative binomial hurdle and zero-inflated models.
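To make the contrast between the two zero-handling assumptions concrete, the following minimal sketch writes out the standard (non-switching) zero-inflated and hurdle negative binomial probability mass functions. It is an illustration under stated assumptions, not the models fitted in the paper: the function names, the mean/dispersion parameterization (`mu`, `r`), and the single `presence` probability are all hypothetical choices made here.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and dispersion (size) r (assumed parameterization)."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, presence, mu, r):
    """Zero-inflated NB: a zero can occur when the disease is absent
    OR when it is present but the count process yields no reported cases."""
    if y == 0:
        return (1 - presence) + presence * nb_pmf(0, mu, r)
    return presence * nb_pmf(y, mu, r)


def hurdle_nb_pmf(y, presence, mu, r):
    """Hurdle NB: a zero occurs only when the disease is absent;
    positive counts follow a zero-truncated NB."""
    if y == 0:
        return 1 - presence
    return presence * nb_pmf(y, mu, r) / (1 - nb_pmf(0, mu, r))


if __name__ == "__main__":
    presence, mu, r = 0.3, 2.0, 1.5  # illustrative values only
    print("P(Y = 0), zero-inflated:", zinb_pmf(0, presence, mu, r))
    print("P(Y = 0), hurdle:       ", hurdle_nb_pmf(0, presence, mu, r))
```

For the same `presence` probability, the zero-inflated form places extra mass on zero because a present disease can still produce a zero count, whereas the hurdle form ties every zero to absence. In a Markov switching version, `presence` would additionally depend on whether the disease was present at the previous time point, which is what separates reemergence from persistence.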