Bayesian methods are a popular choice for statistical inference in small-data regimes due to the regularization effect induced by the prior, which serves to counteract overfitting. In the context of density estimation, the standard Bayesian approach is to target the posterior predictive. In general, direct estimation of the posterior predictive is intractable and so methods typically resort to approximating the posterior distribution as an intermediate step. The recent development of recursive predictive copula updates, however, has made it possible to perform tractable predictive density estimation without the need for posterior approximation. Although these estimators are computationally appealing, they tend to struggle on non-smooth data distributions. This is largely due to the comparatively restrictive form of the likelihood models from which the proposed copula updates were derived. To address this shortcoming, we consider a Bayesian nonparametric model with an autoregressive likelihood decomposition and Gaussian process prior, which yields a data-dependent bandwidth parameter in the copula update. Further, we formulate a novel parameterization of the bandwidth using an autoregressive neural network that maps the data into a latent space, and is thus able to capture more complex dependencies in the data. Our extensions increase the modelling capacity of existing recursive Bayesian density estimators, achieving state-of-the-art results on tabular data sets.
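For orientation, the kind of recursive copula update that this work extends can be sketched in one dimension. This is a minimal illustration under assumptions the abstract does not state: a bivariate Gaussian copula with a fixed correlation `rho` playing the role of the constant bandwidth, a standard normal initial predictive, and step sizes `alpha_i = (2 - 1/i) / (i + 1)` as commonly used in the recursive-copula literature. The function names `copula_terms` and `recursive_density` are hypothetical. The sketch does not implement the data-dependent bandwidth, the Gaussian process prior, or the autoregressive neural network proposed here.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: assumed initial predictive p_0


def copula_terms(u, v, rho):
    """Gaussian copula density c_rho(u, v) and conditional CDF H_rho(u | v)."""
    a, b = STD.inv_cdf(u), STD.inv_cdf(v)
    s2 = 1.0 - rho * rho
    expo = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * s2)
    dens = math.exp(expo) / math.sqrt(s2)
    cond = STD.cdf((a - rho * b) / math.sqrt(s2))
    return dens, cond


def recursive_density(data, grid, rho=0.8, eps=1e-10):
    """Update the predictive density and CDF one observation at a time.

    The density and CDF are tracked on the grid and at the data points,
    so that P_{i-1}(y_i) is available when observation y_i arrives.
    """
    pts = list(grid) + list(data)
    pdf = [STD.pdf(x) for x in pts]
    cdf = [STD.cdf(x) for x in pts]
    g = len(grid)
    for i in range(1, len(data) + 1):
        alpha = (2.0 - 1.0 / i) / (i + 1.0)  # assumed step-size sequence
        # CDF of the new observation under the previous predictive,
        # clipped away from 0 and 1 so that inv_cdf stays finite.
        v = min(max(cdf[g + i - 1], eps), 1.0 - eps)
        for j in range(len(pts)):
            u = min(max(cdf[j], eps), 1.0 - eps)
            dens, cond = copula_terms(u, v, rho)
            # p_i(y) = [1 - alpha + alpha * c_rho(u, v)] * p_{i-1}(y)
            pdf[j] *= 1.0 - alpha + alpha * dens
            # P_i(y) = (1 - alpha) * P_{i-1}(y) + alpha * H_rho(u | v)
            cdf[j] = (1.0 - alpha) * u + alpha * cond
    return pdf[:g], cdf[:g]
```

Calling `recursive_density(data, grid)` with standardized data and a grid covering roughly [-6, 6] returns the estimated density and CDF on that grid. In this sketch `rho` is a single constant for all observations, which is the restriction the data-dependent bandwidth described above is meant to relax.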