We study a natural extension of classical empirical risk minimization, where the hypothesis space is a random subspace of a given space. In particular, we consider possibly data dependent subspaces spanned by a random subset of the data, recovering as a special case Nystrom approaches for kernel methods. Considering random subspaces naturally leads to computational savings, but the question is whether the corresponding learning accuracy is degraded. These statistical-computational tradeoffs have been recently explored for the least squares loss and self-concordant loss functions, such as the logistic loss. Here, we work to extend these results to convex Lipschitz loss functions, that might not be smooth, such as the hinge loss used in support vector machines. This unified analysis requires developing new proofs, that use different technical tools, such as sub-gaussian inputs, to achieve fast rates. Our main results show the existence of different settings, depending on how hard the learning problem is, for which computational efficiency can be improved with no loss in performance.
翻译:我们研究经典实验风险最小化的自然延伸, 假设空间是特定空间的随机子空间。 特别是, 我们考虑数据依赖的子空间可能由随机数据子集覆盖, 恢复为Nystrom方法的特例。 随机子空间自然会节省计算费用, 但问题是相应的学习准确性是否退化。 这些统计- 计算取舍取舍最近被探索了最小的平方损失和自我对应的损失功能, 如后勤损失。 在这里, 我们努力将这些结果扩展为可能不平稳的 convex Lipschitz 损失功能, 比如支持矢量机器所使用的断链损失。 这种统一分析需要开发新的证据, 使用不同的技术工具, 如亚高加索投入, 实现快速率。 我们的主要结果显示存在不同的环境, 取决于学习问题有多困难, 如何提高计算效率, 而不造成绩效损失 。