Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. Experiments using hundreds of matrices from diverse domains show that it often runs $100\times$ faster than exact matrix products and $10\times$ faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds. These results suggest that a mixture of hashing, averaging, and byte shuffling (the core operations of our method) could be a more promising building block for machine learning than the sparsified, factorized, and/or scalar quantized matrix products that have recently been the focus of substantial research and hardware investment.
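To make the idea concrete, below is a minimal sketch of lookup-based approximate matrix multiplication in the spirit of hashing plus averaging: a generic product-quantization-style scheme, not the paper's exact algorithm. All function names (`fit_prototypes`, `build_luts`, `encode`, `approx_matmul`) and parameter choices are illustrative assumptions. It exploits the setting where one matrix $B$ is known ahead of time, so $AB$ is estimated at query time with table lookups and sums rather than multiply-adds on $A$.

```python
# Hypothetical sketch: approximate A @ B via prototypes + lookup tables.
# Not the paper's exact method; a simple product-quantization-style stand-in.
import numpy as np

def fit_prototypes(A_train, n_codebooks=4, n_protos=16, n_iters=10):
    """Learn per-subspace prototypes from training rows of A (plain k-means)."""
    splits = np.array_split(np.arange(A_train.shape[1]), n_codebooks)
    protos = []
    for idx in splits:
        X = A_train[:, idx]
        C = X[np.random.choice(len(X), n_protos, replace=False)]  # init
        for _ in range(n_iters):
            assign = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
            for k in range(n_protos):
                if np.any(assign == k):
                    C[k] = X[assign == k].mean(0)
        protos.append(C)
    return splits, protos

def build_luts(B, splits, protos):
    """Precompute prototype-times-B lookup tables once, offline."""
    return [C @ B[idx] for idx, C in zip(splits, protos)]  # each (n_protos, M)

def encode(A, splits, protos):
    """'Hash' each row of A to its nearest prototype id in every subspace."""
    codes = []
    for idx, C in zip(splits, protos):
        X = A[:, idx]
        codes.append(np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1))
    return codes  # list of (N,) integer code vectors

def approx_matmul(codes, luts):
    """Estimate A @ B by gathering and summing table rows: no multiply-adds on A."""
    out = np.zeros((len(codes[0]), luts[0].shape[1]))
    for code, lut in zip(codes, luts):
        out += lut[code]
    return out

# Usage: compare the approximation against the exact product on random data.
rng = np.random.default_rng(0)
A, B = rng.standard_normal((256, 64)), rng.standard_normal((64, 32))
splits, protos = fit_prototypes(A)
luts = build_luts(B, splits, protos)
approx = approx_matmul(encode(A, splits, protos), luts)
err = np.linalg.norm(approx - A @ B) / np.linalg.norm(A @ B)
print(f"relative error: {err:.3f}")
```

At query time the only work per row of $A$ is one nearest-prototype lookup per subspace plus table gathers and additions, which is what makes the "zero multiply-adds" regime possible once $B$ and the tables are fixed.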