We present a new nonparametric mixture-of-experts model for multivariate regression, inspired by the probabilistic k-nearest neighbors algorithm. Under a conditionally specified model, predictions for out-of-sample inputs are based on their similarity to each observed data point, yielding predictive distributions represented by Gaussian mixtures. Posterior inference on the parameters of the mixture components and on the distance metric is performed with a mean-field variational Bayes algorithm accompanied by a stochastic gradient-based optimization procedure. The proposed method is especially advantageous when the input dimension is high relative to the sample size, when input-output relationships are complex, and when predictive distributions may be skewed or multimodal. Computational studies on five datasets, two of which are synthetically generated, illustrate clear advantages of our mixture-of-experts method for high-dimensional inputs: it outperforms competitor models both in validation metrics and under visual inspection.
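To make the idea of similarity-based Gaussian-mixture predictions concrete, the following is a minimal sketch and not the model from this work. It assumes a scalar output, softmax weights over negative scaled squared distances, one Gaussian expert per observed point centered at its observed output, and a shared standard deviation. The function names and the `metric_scales` and `sigma` parameters are hypothetical, and all values are fixed by hand rather than inferred by variational Bayes.

```python
import numpy as np

def predictive_mixture(x_new, X, Y, metric_scales, sigma):
    """Return (weights, means, sigma) of a 1-D Gaussian mixture for x_new.

    Illustrative assumptions (not taken from the abstract):
    - similarity is a softmax over negative scaled squared distances,
    - each training point is an expert centered at its observed output,
    - all experts share one standard deviation `sigma`.
    """
    diffs = (X - x_new) * metric_scales      # per-dimension metric scaling, shape (n, d)
    sq_dist = np.sum(diffs ** 2, axis=1)     # squared distance to each observation, shape (n,)
    logits = -sq_dist
    logits = logits - logits.max()           # shift for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()        # mixture weights sum to 1
    return weights, Y.copy(), sigma

def mixture_density(y, weights, means, sigma):
    """Evaluate the Gaussian-mixture predictive density at a scalar y."""
    comps = np.exp(-0.5 * ((y - means) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return float(np.sum(weights * comps))

# Toy usage: two nearby observations and one distant observation.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
Y = np.array([1.0, 1.2, -3.0])
w, m, s = predictive_mixture(np.array([0.0, 0.0]), X, Y, np.array([1.0, 1.0]), 0.5)
print(w)                              # the two nearby points carry almost all the weight
print(mixture_density(1.1, w, m, s))  # density near the nearby points' outputs
```

Because the predictive distribution is a weighted mixture rather than a single Gaussian, it can be skewed or multimodal when nearby observations have dissimilar outputs. In this sketch the metric scales are set by hand; in the approach described above, the distance metric is inferred together with the mixture-component parameters.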