Predictive models for binary data are fundamental in various fields, and the growing complexity of modern applications has motivated several flexible specifications for modeling the relationship between the observed predictors and the binary responses. A widely implemented solution is to express the probability parameter via a probit mapping of a Gaussian process indexed by predictors. However, unlike in continuous settings, there is a lack of closed-form results for predictive distributions in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximation strategies provide common solutions to this problem, but state-of-the-art algorithms are either computationally intractable or inaccurate in moderate-to-high dimensions. In this article, we aim to fill this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate these quantities, we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities, and on mean-field variational approximations of multivariate truncated normals. Closed-form expressions for the marginal likelihood and for the posterior distribution of the Gaussian process are also discussed. As shown in simulated and real-world empirical studies, the proposed methods scale to dimensions where state-of-the-art solutions are impractical.
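To make the role of multivariate Gaussian cumulative distribution functions concrete, the following is a minimal sketch, not the article's implementation. It assumes a zero-mean Gaussian process with a squared-exponential kernel and the latent-variable form of the probit model, y_i = 1 if z_i + eps_i > 0 with eps_i ~ N(0, 1). Under these assumptions the marginal likelihood is a Gaussian orthant probability, and the predictive probability is a ratio of two such probabilities. The names `se_kernel`, `orthant_prob` and `predictive_prob`, the kernel choice and the length scale are all hypothetical, and SciPy's generic `multivariate_normal.cdf` stands in for the tile-low-rank Monte Carlo routine, which is not reproduced here.

```python
import numpy as np
from scipy.stats import multivariate_normal


def se_kernel(X1, X2, length_scale=1.0):
    """Squared-exponential kernel (an illustrative assumption, not from the source)."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def orthant_prob(signs, K):
    """Pr(signs_i * (z_i + eps_i) > 0 for all i), with z ~ N(0, K), eps ~ N(0, I).

    By symmetry of the zero-mean Gaussian, this equals the multivariate
    Gaussian CDF at the origin with covariance D (K + I) D, D = diag(signs).
    """
    m = len(signs)
    D = np.diag(signs)
    cov = D @ (K + np.eye(m)) @ D
    return float(multivariate_normal.cdf(np.zeros(m), mean=np.zeros(m), cov=cov))


def predictive_prob(X, y, x_new, y_new=1, length_scale=1.0):
    """Pr(y_new | y) as a ratio of two Gaussian orthant probabilities."""
    signs = 2.0 * np.asarray(y, dtype=float) - 1.0
    n = len(signs)
    X_all = np.vstack([X, x_new[None, :]])
    K_all = se_kernel(X_all, X_all, length_scale)
    numerator = orthant_prob(np.append(signs, 2.0 * y_new - 1.0), K_all)
    denominator = orthant_prob(signs, K_all[:n, :n])  # marginal likelihood of y
    return numerator / denominator


# Usage: three training points on the real line and one test location.
X_train = np.array([[-1.0], [0.0], [1.0]])
y_train = np.array([0, 1, 1])
p_one = predictive_prob(X_train, y_train, np.array([0.5]), y_new=1)
p_zero = predictive_prob(X_train, y_train, np.array([0.5]), y_new=0)
```

The two predictive probabilities should sum to one up to the numerical error of the CDF routine. A generic CDF evaluator like this one is only practical for small n, which is the limitation the scalable methods described above are meant to address.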