In the era of open data, Poisson and other count regression models are increasingly important. Still, conventional Poisson regression has unresolved issues in identifiability and computational efficiency. In particular, owing to an identification problem, Poisson regression can be unstable for small samples with many zeros. Given this, we develop a closed-form inference for over-dispersed Poisson regression, including Poisson additive mixed models. The approach is derived via a mode-based log-Gaussian approximation. The resulting method is fast, practical, and free from the identification problem. Monte Carlo experiments demonstrate that the estimation error of the proposed method is considerably smaller than that of the closed-form alternatives and as small as that of the usual Poisson regression. For counts with many zeros, our approximation achieves better estimation accuracy than conventional Poisson regression. We obtained similar results for Poisson additive mixed modeling with spatial or group effects. The developed method was applied to analyze COVID-19 data in Japan. The results suggest that the influences of pedestrian density, age, and other factors on the number of cases change across periods.