In this paper, we propose novel Gaussian process-gated hierarchical mixtures of experts (GPHMEs), in which both the gates and the experts are built with Gaussian processes. Unlike other mixtures of experts, where the gating models are linear in the input, the gating functions of our model are inner nodes built with Gaussian processes based on random features, and are therefore nonlinear and nonparametric. Further, the experts are also built with Gaussian processes and provide predictions that depend on the test data. The optimization of the GPHMEs is carried out by variational inference. The proposed GPHMEs have several advantages. First, they outperform tree-based HME benchmarks that partition the data in the input space. Second, they achieve good performance with reduced complexity. Third, they provide interpretability of deep Gaussian processes and, more generally, of deep Bayesian neural networks. Our GPHMEs demonstrate excellent performance on large-scale data sets, even with quite modest model sizes.
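To make the architecture concrete, the following is a minimal sketch (not the paper's implementation) of a single GP-gated node: the input is mapped through random Fourier features approximating an RBF kernel, a sigmoid of a linear function of those features acts as the soft gate, and two random-feature "experts" are mixed by the gate. All weights and dimensions here are hypothetical placeholders; in the actual model they would be governed by variational posteriors and arranged in a deeper tree.

```python
import numpy as np

rng = np.random.default_rng(0)

def rff(x, W, b):
    # Random Fourier features approximating an RBF kernel (Rahimi & Recht).
    return np.sqrt(2.0 / W.shape[1]) * np.cos(x @ W + b)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d, m = 2, 64                          # input dim, number of random features
W = rng.normal(size=(d, m))           # spectral frequencies (hypothetical)
b = rng.uniform(0.0, 2 * np.pi, m)    # phase offsets

# One inner (gating) node and two leaf experts, all sharing the feature map.
theta_gate = rng.normal(size=m)       # gate weights (stand-in for a variational mean)
theta_left = rng.normal(size=m)       # left-expert weights
theta_right = rng.normal(size=m)      # right-expert weights

def predict(x):
    phi = rff(x, W, b)                # (n, m) nonlinear feature map
    g = sigmoid(phi @ theta_gate)     # soft, nonlinear routing probability per input
    f_left = phi @ theta_left         # expert predictions
    f_right = phi @ theta_right
    return g * f_left + (1.0 - g) * f_right  # gated mixture of the two experts
```

Because the gate is a nonlinear function of the random features rather than a linear function of the raw input, the induced partition of the input space is nonlinear, which is the property the abstract contrasts with classical linearly gated HMEs.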