Generalized linear models (GLMs) arguably represent the standard approach to statistical regression beyond the Gaussian likelihood scenario. When Bayesian formulations are employed, the general absence of a tractable posterior distribution has motivated the development of deterministic approximations, which are typically more scalable than sampling techniques. Among them, expectation propagation (EP) has shown remarkable accuracy, usually higher than that of many variational Bayes solutions. However, the higher computational cost of EP has raised concerns about its practical feasibility, especially in high-dimensional settings. We address these concerns by deriving a novel, efficient formulation of EP for GLMs whose cost scales linearly in the number of covariates p. This reduces the state-of-the-art O(p^2 n) per-iteration computational cost of the EP routine for GLMs to O(p n min{p, n}), with n being the sample size. We also show that, for binary models and log-linear GLMs, approximate predictive means can be obtained at no additional cost. To preserve efficient moment matching for count data, we propose employing a combination of log-normal Laplace transform approximations, avoiding numerical integration. These novel results open the possibility of employing EP in settings that were previously deemed practically unfeasible. Improvements over state-of-the-art approaches are illustrated on both simulated and real data. The efficient EP implementation is available at https://github.com/niccoloanceschi/EPglm.
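As a purely illustrative aside, not part of the abstract itself, the following minimal Python sketch contrasts the two per-iteration cost scalings quoted above. The dimensions and helper functions are hypothetical and are not drawn from the EPglm package; the sketch only shows that the gain of the O(p n min{p, n}) formulation over the O(p^2 n) one materializes in the high-dimensional regime p > n, where the speedup factor grows like p/n.

```python
# Illustrative comparison of the two per-iteration cost scalings for EP on GLMs.
# Dimensions (p, n) below are hypothetical; the functions mirror only the
# asymptotic orders stated in the abstract, not any actual implementation.

def standard_cost(p: int, n: int) -> int:
    """Standard EP routine for GLMs: O(p^2 * n) per iteration."""
    return p**2 * n

def efficient_cost(p: int, n: int) -> int:
    """Efficient formulation: O(p * n * min(p, n)) per iteration."""
    return p * n * min(p, n)

for p, n in [(100, 1_000), (10_000, 500), (50_000, 200)]:
    speedup = standard_cost(p, n) / efficient_cost(p, n)
    # When p <= n the two orders coincide (speedup ~ 1); when p > n the
    # theoretical speedup factor is p / n, i.e., linear in p for fixed n.
    print(f"p={p:>6}, n={n:>5}: theoretical speedup ~ {speedup:,.0f}x")
```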