A sketching algorithm is a way to solve an optimization problem approximately and in a fraction of the usual time. We consider classical sketching algorithms which first compress data by multiplication with a random "sketch matrix". Our work improves and extends the "learned sketch" paradigm, in which sketch matrices are optimized to yield better expected performance. This technique has only been used for a suboptimal variant of sketched low-rank decomposition (LRD). Our work extends the problem coverage to optimal sketched LRD, least-squares regression (LS), and $k$-means clustering. We improve sketch learning for all three problems and very significantly for LS and LRD: experimental performance increases by $12\%$ and $20\%$, respectively. (Interestingly, we can also prove that we get a strict improvement for LRD under certain conditions.) Finally, we design two sketching algorithm modifications that leverage the strong expected performance of learned sketches, provide worst-case performance guarantees, and have the same time complexity as classical sketching. We prove the worst-case property for each of the problems and their modified algorithms.
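To make the classical "sketch-and-solve" idea concrete, here is a minimal sketch for least-squares regression. It assumes a dense Gaussian sketch matrix, chosen only for brevity: it is not necessarily the sketch family used in this work, and a dense sketch does not by itself give a speedup (practical sketches are typically sparse or structured). The function name `sketched_least_squares` and the problem sizes `n`, `d`, `m` are hypothetical, illustrative choices.

```python
import numpy as np

def sketched_least_squares(A, b, m, rng):
    """Approximately solve min_x ||Ax - b||_2 using a random sketch with m rows."""
    n = A.shape[0]
    # Assumption: dense Gaussian sketch, scaled so that E[S^T S] = I.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    # Compress the problem from n rows to m rows, then solve the small problem.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

rng = np.random.default_rng(0)
n, d, m = 2000, 10, 200  # illustrative sizes, with d < m << n
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact solution for reference.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, m, rng)

# Compare residual norms on the ORIGINAL (unsketched) problem.
cost_exact = np.linalg.norm(A @ x_exact - b)
cost_sketch = np.linalg.norm(A @ x_sketch - b)
ratio = cost_sketch / cost_exact
print(f"approximation ratio: {ratio:.4f}")
```

The ratio printed at the end is the quantity a sketch is judged by: it can never fall below 1, and a better sketch matrix brings it closer to 1. In the learned-sketch setting described above, the random draw of `S` would be replaced by a matrix optimized on training data.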