Communication lower bounds have long been established for matrix multiplication algorithms. However, most methods of asymptotic analysis have either ignored the constant factors or not obtained the tightest possible values. Recent work has demonstrated that more careful analysis improves the best known constants for some classical matrix multiplication lower bounds and helps to identify more efficient algorithms that match the leading-order terms in the lower bounds exactly and improve practical performance. The main result of this work is the establishment of memory-independent communication lower bounds with tight constants for parallel matrix multiplication. Our constants improve on previous work in each of three cases that depend on the relative sizes of the aspect ratios of the matrices.
翻译:长期以来,已经为矩阵乘法确定了通信的较低界限,然而,大多数无药可救的分析方法要么忽视了常数因素,要么没有获得尽可能最接近的值;最近的工作表明,更仔细的分析改进了某些古典矩阵乘法乘法下限最已知的常数,有助于确定与下限主序条件完全匹配的更有效算法,并改进实际性能。这项工作的主要结果是建立了内存独立通信的较低界限,为平行的矩阵乘法设置了紧凑的常数。在取决于矩阵侧比相对大小的三种情况下,我们的常数都比以往的工作有所改进。