The stable under iterated tessellation (STIT) process is a stochastic process that produces a recursive partition of space with cut directions drawn independently from a distribution over the sphere. The case of random axis-aligned cuts is known as the Mondrian process. Random forests and Laplace kernel approximations built from the Mondrian process have led to efficient online learning methods and Bayesian optimization. In this work, we utilize tools from stochastic geometry to resolve some fundamental questions concerning STIT processes in machine learning. First, we show that a STIT process with cut directions drawn from a discrete distribution can be efficiently simulated by lifting to a higher dimensional axis-aligned Mondrian process. Second, we characterize all possible kernels that stationary STIT processes and their mixtures can approximate. We also give a uniform convergence rate for the approximation error of the STIT kernels to the targeted kernels, generalizing the work of [3] for the Mondrian case. Third, we obtain consistency results for STIT forests in density estimation and regression. Finally, we give a formula for the density estimator arising from an infinite STIT random forest. This allows for precise comparisons between the Mondrian forest, the Mondrian kernel and the Laplace kernel in density estimation. Our paper calls for further developments at the novel intersection of stochastic geometry and machine learning.
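To make the axis-aligned special case mentioned above concrete, the sketch below simulates a basic Mondrian process on a box (recursive partitioning with exponential cut times and uniform axis-aligned cuts). This is an illustrative sketch only; the function and parameter names (e.g. `sample_mondrian`, `budget`) are not taken from the paper, and it does not implement the paper's lifting construction for general STIT cut directions.

```python
# Minimal illustrative sketch of an axis-aligned Mondrian process on a box.
# Names such as `sample_mondrian` and `budget` are assumptions for illustration.
import random

def sample_mondrian(lower, upper, budget, time=0.0):
    """Recursively partition the box [lower, upper] until the time budget runs out.

    lower, upper: per-dimension box boundaries.
    budget: lifetime parameter; larger values yield finer partitions.
    Returns a list of leaf boxes (lower, upper) forming the partition.
    """
    side_lengths = [u - l for l, u in zip(lower, upper)]
    linear_dim = sum(side_lengths)
    # The next cut arrives after an Exponential(linear dimension) waiting time.
    wait = random.expovariate(linear_dim) if linear_dim > 0 else float("inf")
    if time + wait > budget:
        return [(list(lower), list(upper))]
    # Cut dimension chosen proportionally to side length, location uniformly.
    d = random.choices(range(len(lower)), weights=side_lengths)[0]
    cut = random.uniform(lower[d], upper[d])
    left_upper, right_lower = list(upper), list(lower)
    left_upper[d], right_lower[d] = cut, cut
    return (sample_mondrian(lower, left_upper, budget, time + wait)
            + sample_mondrian(right_lower, upper, budget, time + wait))

# Example: partition the unit square with lifetime 5.
cells = sample_mondrian([0.0, 0.0], [1.0, 1.0], budget=5.0)
print(len(cells), "cells")
```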