Deterministic Gaussian approximations of intractable posterior distributions are common in Bayesian inference. From an asymptotic perspective, the Bernstein-von Mises theorem provides a theoretical justification in regular parametric settings. However, this limiting behavior may require a large sample size before it becomes visible in practice. In fact, with small-to-moderate sample sizes, even simple parametric models often yield posterior distributions that are far from Gaussian in shape, mainly due to skewness. In this article, we provide rigorous theoretical arguments for this behavior by deriving a novel limiting law that coincides with a closed-form and tractable skewed generalization of Gaussian densities. Its total variation distance from the exact posterior converges at a rate that improves by a $\sqrt{n}$ factor on the rate obtained under the classical Bernstein-von Mises theorem based on limiting Gaussians. In contrast to higher-order approximations (which require finite truncations for inference, possibly even leading to negative densities), our theory characterizes the limiting behavior of Bayesian posteriors with respect to a sequence of valid and tractable densities. This further motivates a practical plug-in version that replaces the unknown model parameters with the corresponding MAP estimate, yielding a novel skew-modal approximation that achieves the same improved rate of convergence as its population counterpart. Extensive quantitative studies confirm that our new theory closely matches the empirical behavior observed in practice, even in finite, possibly small, sample regimes. The proposed skew-modal approximation further exhibits improved accuracy not only relative to the classical Laplace approximation, but also with respect to state-of-the-art approximations from mean-field variational Bayes and expectation-propagation.
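As a rough illustration of the small-sample skewness discussed above, the following sketch compares an exact posterior with its classical Laplace (Gaussian-at-the-mode) approximation and reports the total variation distance for growing sample sizes. It does not implement the skew-modal approximation of the article. The Bernoulli model with a uniform prior, the fixed success rate of 0.2, and all function names are assumptions chosen only for this example.

```python
import math

def beta_logpdf(theta, a, b):
    # Log-density of Beta(a, b), the exact posterior in this toy model.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log1p(-theta)

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def normal_cdf(x, mean, sd):
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def laplace_tv(successes, n, grid=20000):
    """Total variation distance between the Beta posterior and its
    Laplace approximation (assumed setup: Bernoulli data, uniform prior)."""
    a, b = successes + 1, n - successes + 1
    mode = successes / n                    # MAP estimate of the success probability
    sd = math.sqrt(mode * (1 - mode) / n)   # inverse curvature of the log-posterior at the MAP
    h = 1.0 / grid
    inside = 0.0
    for i in range(grid):                   # midpoint rule on (0, 1)
        t = (i + 0.5) * h
        inside += abs(math.exp(beta_logpdf(t, a, b)) - normal_pdf(t, mode, sd)) * h
    # Gaussian mass outside (0, 1), where the exact posterior density is zero.
    outside = normal_cdf(0, mode, sd) + (1 - normal_cdf(1, mode, sd))
    return 0.5 * (inside + outside)

def tv_table(rate=0.2, sizes=(10, 40, 160, 640)):
    # Hypothetical data sets with a fixed observed success rate.
    return {n: laplace_tv(round(rate * n), n) for n in sizes}

if __name__ == "__main__":
    for n, tv in tv_table().items():
        print(f"n = {n:4d}   TV(posterior, Laplace) = {tv:.4f}")
```

In this toy setting the distance shrinks as the sample size grows, but it remains clearly nonzero for small samples, where the Beta posterior is visibly asymmetric around its mode.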