We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, by constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the local estimators, which keeps key quantities such as the local effective dimension and bias under control. We characterize the statistical-computational tradeoff of our model and demonstrate the effectiveness of our method with numerical experiments on large-scale datasets.
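To make the combination of partitioning and random projections concrete, here is a minimal sketch of a partitioned kernel ridge regressor. It is an illustration under stated assumptions, not the ParK implementation. All function names and parameters are hypothetical. The partition is a Voronoi assignment under the kernel-induced distance, with centers drawn uniformly from the data. Each cell is fit with a Nyström approximation on uniformly sampled landmarks, used here as one simple form of random projection. A direct linear solve stands in for the iterative optimizer. Note that for the Gaussian kernel used here the feature-space distance is a monotone function of the input-space distance, so the resulting cells coincide with an input-space Voronoi partition; a different kernel or center-selection rule would be needed to see a difference.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel matrix between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] - 2.0 * A @ B.T + (B**2).sum(1)[None, :]
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def assign_cells(X, centers, sigma=1.0):
    # Squared feature-space distance: k(x,x) - 2 k(x,c) + k(c,c).
    # For the Gaussian kernel k(x,x) = k(c,c) = 1, so this is 2 - 2 k(x,c).
    d2 = 2.0 - 2.0 * gaussian_kernel(X, centers, sigma)
    return d2.argmin(axis=1)

def fit_local_nystrom(X, y, m, lam, sigma, rng):
    # Nystrom KRR on one cell: the solution is restricted to the span of
    # at most m uniformly sampled landmark points (assumed sampling scheme).
    n = X.shape[0]
    idx = rng.choice(n, size=min(m, n), replace=False)
    Z = X[idx]
    Knm = gaussian_kernel(X, Z, sigma)
    Kmm = gaussian_kernel(Z, Z, sigma)
    # Small diagonal jitter for numerical stability (illustrative choice).
    A = Knm.T @ Knm + lam * n * Kmm + 1e-6 * np.eye(len(idx))
    # Direct solve; a large-scale solver would use an iterative method here.
    alpha = np.linalg.solve(A, Knm.T @ y)
    return Z, alpha

def fit_partitioned_krr(X, y, n_cells=4, m=50, lam=1e-3, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Centers are sampled uniformly from the data (assumed rule).
    centers = X[rng.choice(X.shape[0], size=n_cells, replace=False)]
    cells = assign_cells(X, centers, sigma)
    models = {}
    for q in range(n_cells):
        mask = cells == q
        if mask.any():
            models[q] = fit_local_nystrom(X[mask], y[mask], m, lam, sigma, rng)
    return centers, models

def predict(Xnew, centers, models, sigma=1.0):
    # Route each point to its cell and evaluate that cell's local estimator.
    cells = assign_cells(Xnew, centers, sigma)
    out = np.zeros(Xnew.shape[0])
    for q, (Z, alpha) in models.items():
        mask = cells == q
        if mask.any():
            out[mask] = gaussian_kernel(Xnew[mask], Z, sigma) @ alpha
    return out
```

Each local problem costs time and memory that depend only on the cell size and the number of landmarks `m`, which is where the savings over a single global solve come from. The price is a choice of `n_cells` and `m` that has to be balanced against accuracy.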