Data analysis and individual policy-level modeling for insurance involve handling large data sets with strong spatiotemporal correlations, non-Gaussian distributions, and complex hierarchical structures. In this research, we demonstrate that gradient-based Markov chain Monte Carlo (MCMC) techniques accelerated by graphics processing units (GPUs) overcome the trade-off between model complexity and scalable inference at the million-record scale. Implementing our model in NumPyro, we use its built-in MCMC capabilities to fit a model with multiple sophisticated components, such as latent conditional autoregression and spline-based exposure adjustment, achieving an 8.8x speedup over CPU-based implementations. We apply this model to a case study of 2.6 million individual policy-level claim count records for automobile insurance from Brazil in 2011. We illustrate how this modeling approach significantly advances current risk assessment processes for numerous, closely related outcomes. The code and data are available at https://github.com/ckrapu/bayes-at-scale.
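To make the idea of gradient-based MCMC on exposure-adjusted claim counts concrete, the following is a minimal sketch only, not the model described above. It assumes a single log-rate parameter `beta` with y_i ~ Poisson(E_i * exp(beta)) and a wide normal prior, and it swaps in plain fixed-length Hamiltonian Monte Carlo written in standard-library Python rather than NumPyro's sampler on a GPU. The toy data, the function names `log_post_and_grad` and `hmc`, and all tuning values are hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical toy data: claim counts and policy exposures (in policy-years).
counts = [0, 1, 0, 0, 0, 2, 0, 0, 1, 0] * 20
exposures = [1.0, 0.5, 1.0, 0.25, 1.0, 1.0, 0.75, 1.0, 0.5, 1.0] * 20


def log_post_and_grad(beta, counts, exposures, prior_sd=10.0):
    """Log posterior (up to a constant) and its gradient for
    y_i ~ Poisson(E_i * exp(beta)), beta ~ Normal(0, prior_sd^2)."""
    total_y = sum(counts)
    total_e = sum(exposures)
    rate = math.exp(beta)
    logp = total_y * beta - total_e * rate - 0.5 * (beta / prior_sd) ** 2
    grad = total_y - total_e * rate - beta / prior_sd ** 2
    return logp, grad


def hmc(counts, exposures, n_samples=2000, step=0.02, n_leapfrog=10, seed=0):
    """Fixed-length HMC for the scalar log-rate; returns the list of draws."""
    rng = random.Random(seed)
    beta = 0.0
    logp, grad = log_post_and_grad(beta, counts, exposures)
    draws = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)            # resample momentum
        h_old = -logp + 0.5 * p * p        # Hamiltonian at the start
        new_beta = beta
        p_new = p + 0.5 * step * grad      # initial half step for momentum
        for i in range(n_leapfrog):
            new_beta += step * p_new       # full step for position
            new_logp, new_grad = log_post_and_grad(new_beta, counts, exposures)
            if i < n_leapfrog - 1:
                p_new += step * new_grad   # full step for momentum
        p_new += 0.5 * step * new_grad     # final half step for momentum
        h_new = -new_logp + 0.5 * p_new * p_new
        # Metropolis correction for the leapfrog discretization error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            beta, logp, grad = new_beta, new_logp, new_grad
        draws.append(beta)
    return draws


draws = hmc(counts, exposures)
kept = draws[200:]                         # discard warm-up draws
posterior_mean = sum(kept) / len(kept)
print("posterior mean of log claim rate:", posterior_mean)
```

With these toy data the total count is 80 over 160 policy-years, so the posterior mean of `beta` should land close to log(0.5). The gradient is what lets each proposal travel far through parameter space while still being accepted often, which is the property that adaptive samplers such as NUTS exploit on much larger hierarchical models.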