Sketching is a dimensionality reduction technique in which one compresses a matrix by random linear combinations of its rows or columns. A line of work has shown how to sketch the Hessian to speed up each iteration of a second-order method, but such sketches usually depend only on the matrix at hand, and in a number of cases are even oblivious to the input matrix. One could instead hope to learn a distribution on sketching matrices that is optimized for the specific distribution of input matrices. We show how to design learned sketches for the Hessian in the context of second-order methods. We prove that a smaller sketching dimension for the column space of a tall matrix is possible, given an oracle that can predict the indices of the rows with large leverage scores. We design such an oracle for various datasets, and this leads to faster convergence of the well-studied iterative Hessian sketch procedure, which applies to a wide range of problems in convex optimization. We show empirically that learned sketches, compared with their "non-learned" counterparts, do improve the approximation accuracy for important problems, including LASSO and matrix estimation with nuclear norm constraints.
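To make the setting concrete, the following is a minimal sketch-and-solve illustration with an oblivious Gaussian sketch applied to an overconstrained least-squares problem. This is a generic non-learned baseline, not the paper's learned construction; all matrix sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall input matrix A (n >> d) and target b, as in overconstrained least squares.
n, d, m = 2000, 20, 200
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Oblivious Gaussian sketch: S has i.i.d. N(0, 1/m) entries, so each row of
# S @ A is a random linear combination of the rows of A. The sketch is drawn
# without looking at A, i.e., it is "oblivious" to the input.
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Sketch-and-solve: work with the m x d compressed problem instead of n x d.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

rel_err = np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact)
print(f"relative error of sketched solution: {rel_err:.3f}")
```

A learned sketch would replace the i.i.d. Gaussian `S` with a data-dependent one, e.g., one that allocates rows to the indices an oracle predicts to have large leverage scores, allowing a smaller sketching dimension `m` for the same accuracy.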