We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression that are comparable to their quantum analogues. De-quantizing such algorithms has received a flurry of attention in recent years; we obtain sharper bounds for these problems. More significantly, we achieve these improvements by arguing that the previous quantum-inspired algorithms for these problems are doing leverage or ridge-leverage score sampling in disguise. With this recognition, we are able to employ the large body of work in numerical linear algebra to obtain algorithms for these problems that are simpler and faster than existing approaches. We also consider static data structures for the above problems, and obtain close-to-optimal bounds for them. To do this, we introduce a new randomized transform, the Gaussian Randomized Hadamard Transform (GRHT). It was thought in the numerical linear algebra community that to obtain nearly-optimal bounds for various problems such as rank computation, finding a maximal linearly independent subset of columns, regression, low rank approximation, maximum matching on general graphs and linear matroid union, that one would need to resolve the main open question of Nelson and Nguyen (FOCS, 2013) regarding the logarithmic factors in existing oblivious subspace embeddings. We bypass this question, using GRHT, and obtain optimal or nearly-optimal bounds for these problems. For the fundamental problems of rank computation and finding a linearly independent subset of columns, our algorithms improve Cheung, Kwok, and Lau (JACM, 2013) and are optimal to within a constant factor and a $\log\log(n)$-factor, respectively. Further, for constant factor regression and low rank approximation we give the first optimal algorithms, for the current matrix multiplication exponent.
翻译:我们创建了古典(非量级)动态数据结构,用于查询建议系统和最小平方回归值,这些系统与数量类比相似。 量化的这种算法近年来引起了人们的注意; 我们为这些问题获得了更清晰的界限。 更重要的是, 我们之所以能实现这些改进,是因为之前的量级驱动算法正在以伪装方式进行杠杆或脊脊裂分数取样。 通过这种认识, 我们能够利用数字直线代数的大量工作来为这些问题获得比现有方法更简单和更快的算法。 我们还考虑为上述问题建立静态数据结构,并为这些问题获得接近至最理想的界限; 为了做到这一点,我们引入了一个新的随机化变换, 高斯随机化的哈达马德变换(GRHT ) 。 在数字直线性平面平面平面计算中, 找到一个最高线性直线级的算法子序列、 回归、 低级向下向下向下直径直径的直径直径直值数据结构、 直径直径直的直径直径直的直径直径直径直的直的直路路径直径直径直径直径直的直的直的算值, 和直的直的直径直径直的直的直的直的直的直的直的直的直的直的直的直的算算。