We study a natural extension of classical empirical risk minimization, where the hypothesis space is a random subspace of a given space. In particular, we consider possibly data-dependent subspaces spanned by a random subset of the data, recovering as a special case Nystr\"om approaches for kernel methods. Considering random subspaces naturally leads to computational savings, but the question is whether the corresponding learning accuracy is degraded. These statistical-computational tradeoffs have recently been explored for the least squares loss and for self-concordant loss functions, such as the logistic loss. Here, we extend these results to convex Lipschitz loss functions that might not be smooth, such as the hinge loss used in support vector machines. This extension requires developing new proofs that use different technical tools. Our main results show the existence of different settings, depending on how hard the learning problem is, for which computational efficiency can be improved with no loss in performance. Theoretical results are illustrated with simple numerical experiments.
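To make the setting concrete, the following is a minimal sketch of ERM with the hinge loss restricted to the random subspace spanned by a subsample of the training points (Nystr\"om subsampling). It assumes a Gaussian kernel, uniform sampling of landmarks, and plain subgradient descent on the regularized objective; the function names and hyperparameters are illustrative and do not correspond to the experiments in the paper.

\begin{verbatim}
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of X and Y.
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T)
    return np.exp(-gamma * sq)

def nystrom_hinge_erm(X, y, m=50, gamma=1.0, lam=1e-3,
                      lr=0.1, epochs=200, seed=0):
    """Hinge-loss ERM over the subspace spanned by m uniformly
    sampled training points (Nystrom subsampling).
    Labels y are assumed to be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)        # random landmarks
    K_nm = gaussian_kernel(X, X[idx], gamma)          # n x m kernel block
    K_mm = gaussian_kernel(X[idx], X[idx], gamma)     # m x m kernel block
    # Nystrom feature map Phi = K_nm K_mm^{-1/2}, so that Phi Phi^T ~ K.
    eigval, V = np.linalg.eigh(K_mm)
    eigval = np.maximum(eigval, 1e-12)
    K_mm_isqrt = (V / np.sqrt(eigval)) @ V.T
    Phi = K_nm @ K_mm_isqrt
    # Subgradient descent on
    #   (1/n) sum_i max(0, 1 - y_i <w, Phi_i>) + lam ||w||^2 .
    w = np.zeros(m)
    for _ in range(epochs):
        margins = y * (Phi @ w)
        active = margins < 1                          # nonzero subgradient
        grad = -(Phi[active].T @ y[active]) / n + 2 * lam * w
        w -= lr * grad
    # Predictor evaluated at new points via the same feature map.
    predict = lambda Xt: gaussian_kernel(Xt, X[idx], gamma) @ K_mm_isqrt @ w
    return idx, w, predict
\end{verbatim}

The point of the sketch is that both training and prediction only touch an $n \times m$ kernel block, with $m \ll n$, which is the source of the computational savings discussed above.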