We prove minimax optimal learning rates for kernel ridge regression, resp. support vector machines, based on a data-dependent partition of the input space, where the dependence on the dimension of the input space is replaced by the fractal dimension of the support of the data-generating distribution. We further show that these optimal rates can be achieved by a training-validation procedure without any prior knowledge of this intrinsic dimension of the data. Finally, we conduct extensive experiments demonstrating that the considered learning methods generalize from a dataset that is non-trivially embedded in a much higher-dimensional space just as well as from the original dataset.