By distributing the training process, local approximation reduces the cost of the standard Gaussian process. An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data, typically by assuming perfect diversity of the local predictors. Although this assumption keeps the aggregation tractable, it is often violated in practice. Taking dependencies between experts into account enables ensemble methods to provide consistent results, but at a high computational cost that is cubic in the number of experts involved. An expert selection strategy makes the final aggregation step more efficient by using fewer experts. However, a static selection approach that assigns a fixed set of experts to each new data point cannot encode the specific properties of each unique data point. This paper proposes a flexible expert selection approach based on the characteristics of entry data points. To this end, we formulate the selection task as a multi-label classification problem in which the experts define the labels and each entry point is assigned to a subset of the experts. We discuss the prediction quality, efficiency, and asymptotic properties of the proposed solution in detail, and demonstrate its efficacy through extensive numerical experiments on synthetic and real-world data sets.
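To make the pipeline concrete, the following is a minimal sketch of entry-dependent expert selection in a distributed Gaussian process, not the paper's exact method: the k-means partitioning, the nearest-centroid relevance labels, the one-vs-rest logistic selector, and the precision-weighted aggregation are all illustrative assumptions standing in for the unspecified details of the abstract.

```python
# Sketch: entry-dependent expert selection via multi-label classification.
# Assumptions (not from the paper): k-means partitions, nearest-centroid
# labels, OneVsRest logistic selector, precision-weighted aggregation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.standard_normal(600)

# 1) Partition the data and train one local GP expert per partition.
K = 6
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)
experts = []
for k in range(K):
    mask = km.labels_ == k
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
    experts.append(gp.fit(X[mask], y[mask]))

# 2) Multi-label targets: tag each training point with the M experts whose
#    partition centroids are closest (an assumed relevance proxy).
M = 2
dist = np.linalg.norm(X[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
labels = np.zeros((len(X), K), dtype=int)
rows = np.arange(len(X))[:, None]
labels[rows, np.argsort(dist, axis=1)[:, :M]] = 1

# 3) Fit a multi-label classifier mapping an entry point to its experts.
selector = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, labels)

# 4) At prediction time, query only the selected experts and combine their
#    posteriors with a precision-weighted average (one common aggregation).
def predict(x_star):
    x_star = np.atleast_2d(x_star)
    chosen = selector.predict(x_star)[0].astype(bool)
    if not chosen.any():  # fall back to the nearest expert if none selected
        chosen[np.argmin(np.linalg.norm(
            km.cluster_centers_ - x_star, axis=1))] = True
    mus, precs = [], []
    for k in np.flatnonzero(chosen):
        mu, sd = experts[k].predict(x_star, return_std=True)
        mus.append(mu[0])
        precs.append(1.0 / sd[0] ** 2)
    return float(np.dot(mus, precs) / np.sum(precs))

print(predict([0.5]))
```

Because the selector is trained once and each query consults only the predicted subset of experts, the aggregation cost at test time scales with the number of selected experts rather than with all K, which is the efficiency argument the abstract makes.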