Despite recent advances in automated machine learning, model selection remains a complex and computationally intensive process. For Gaussian processes (GPs), selecting the kernel is a crucial task, often done manually by an expert. Additionally, evaluating the model selection criteria for Gaussian processes typically scales cubically in the sample size, rendering kernel search particularly computationally expensive. We propose a novel, efficient search method through a general, structured kernel space. Previous methods solved this task via Bayesian optimization and relied on measuring distances between GPs directly in function space to construct a kernel-kernel. We present an alternative approach by defining a kernel-kernel over the symbolic representation of the statistical hypothesis that is associated with a kernel. We empirically show that this leads to a computationally more efficient way of searching through a discrete kernel space.
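To make the idea concrete, the sketch below illustrates (in simplified form) what a kernel-kernel over symbolic kernel representations could look like inside a Bayesian-optimization loop over a discrete kernel space. It is not the paper's construction: the bag-of-symbols featurization, the squared-exponential kernel-kernel, the UCB acquisition rule, and the placeholder model-selection score are all illustrative assumptions.

```python
# Minimal, illustrative sketch (not the actual method): candidate GP kernels are
# represented symbolically, a toy "kernel-kernel" compares these symbolic
# representations, and a GP surrogate over the discrete space proposes which
# candidate to evaluate next with the (expensive) model selection criterion.
import itertools
import numpy as np

BASE_SYMBOLS = ["SE", "PER", "LIN"]  # base kernels in a toy grammar


def candidate_expressions(max_terms=2):
    """Enumerate a small discrete space of symbolic kernel expressions (sums of bases)."""
    exprs = []
    for r in range(1, max_terms + 1):
        for combo in itertools.combinations_with_replacement(BASE_SYMBOLS, r):
            exprs.append(tuple(sorted(combo)))
    return exprs


def featurize(expr):
    """Bag-of-symbols feature vector for a symbolic kernel expression (an assumption)."""
    return np.array([expr.count(s) for s in BASE_SYMBOLS], dtype=float)


def kernel_kernel(expr_a, expr_b, length_scale=1.0):
    """Toy kernel-kernel: squared exponential over the symbolic feature vectors."""
    d = featurize(expr_a) - featurize(expr_b)
    return np.exp(-0.5 * np.dot(d, d) / length_scale**2)


def surrogate_posterior(train_exprs, train_scores, test_exprs, noise=1e-6):
    """GP posterior mean/variance over the discrete kernel space, using the kernel-kernel."""
    K = np.array([[kernel_kernel(a, b) for b in train_exprs] for a in train_exprs])
    K += noise * np.eye(len(train_exprs))
    Ks = np.array([[kernel_kernel(t, a) for a in train_exprs] for t in test_exprs])
    alpha = np.linalg.solve(K, np.array(train_scores))
    mean = Ks @ alpha
    var = 1.0 - np.einsum("ij,ij->i", Ks, np.linalg.solve(K, Ks.T).T)
    return mean, np.maximum(var, 1e-12)


def expensive_model_selection_score(expr):
    """Placeholder for the costly criterion (e.g. log marginal likelihood of a fitted GP)."""
    target = ("PER", "SE")  # pretend this composite kernel explains the data best
    return -float(np.sum(np.abs(featurize(expr) - featurize(target))))


# Bayesian-optimization-style loop over the discrete, symbolic kernel space.
space = candidate_expressions()
evaluated = {space[0]: expensive_model_selection_score(space[0])}
for _ in range(4):
    pool = [e for e in space if e not in evaluated]
    mean, var = surrogate_posterior(list(evaluated), list(evaluated.values()), pool)
    ucb = mean + 1.0 * np.sqrt(var)  # upper-confidence-bound acquisition
    nxt = pool[int(np.argmax(ucb))]
    evaluated[nxt] = expensive_model_selection_score(nxt)

print("best symbolic kernel found:", max(evaluated, key=evaluated.get))
```

The point of the sketch is that the surrogate never needs to fit or compare GPs in function space: similarity is computed directly on the symbolic expressions, so each acquisition step is cheap, and the cubic-cost model-selection criterion is only evaluated for the candidates the loop actually selects.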