Matrix multiplication optimization remains a fundamental challenge in computational mathematics. This work introduces a novel approach for discovering matrix multiplication schemes whose coefficients are restricted to the set $\{-1, 0, 1\}$ (denoted $Z_T$), with the aim of minimizing naive additive complexity for efficient hardware implementation. The core of the method is a GPU-accelerated meta flip graph algorithm that maintains ternary safety through specialized arithmetic operations and sign symmetry breaking. Key results include new best ranks for the formats $4 \times 5 \times 12$, $5 \times 6 \times 10$, and $6 \times 7 \times 9$; the independent discovery of 32 schemes in $Z_T$ that match known optimal ranks (including 8 previously known only with rational coefficients); and 30 rank improvements in the binary field. An analysis of 164 known schemes shows that 92 admit a ternary-coefficient implementation, while for the remaining 72 no such implementation could be found, which marks the current boundary of the approach. All software, results, and discovered schemes are released as open source.
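To make the notion of a scheme in $Z_T$ concrete, the following minimal sketch checks Strassen's classical rank-7 scheme for the $2 \times 2 \times 2$ format, which happens to use only coefficients from $\{-1, 0, 1\}$. It is an illustration only and not the paper's software: the function names `is_ternary`, `is_valid_scheme`, and `naive_additions` are hypothetical, and "naive additive complexity" is assumed here to mean the number of additions and subtractions when every linear combination is evaluated directly, with no reuse of common subexpressions.

```python
from itertools import product

# Strassen's rank-7 scheme for 2x2 times 2x2 (illustrative example, not a
# scheme taken from the paper). Row r of U and V holds the coefficients of
# the r-th product over (A11, A12, A21, A22) and (B11, B12, B21, B22);
# row r of W holds its contribution to (C11, C12, C21, C22).
U = [[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
     [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]]
V = [[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
     [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
W = [[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
     [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]


def is_ternary(*factors):
    """Return True if every coefficient lies in {-1, 0, 1}."""
    return all(c in (-1, 0, 1) for f in factors for row in f for c in row)


def is_valid_scheme(U, V, W, n, m, p):
    """Check the Brent equations for an (n x m) by (m x p) product."""
    for i, j, k, l, a, b in product(range(n), range(m), range(m),
                                    range(p), range(n), range(p)):
        total = sum(u[i * m + j] * v[k * p + l] * w[a * p + b]
                    for u, v, w in zip(U, V, W))
        # A[i][j] * B[k][l] contributes to C[a][b] only when j == k,
        # i == a, and l == b.
        expected = 1 if (j == k and i == a and l == b) else 0
        if total != expected:
            return False
    return True


def naive_additions(U, V, W):
    """Count additions without reusing subexpressions (assumed definition).

    Assumes ternary coefficients and that every row of U and V and every
    output column of W has at least one nonzero entry.
    """
    def nonzeros(rows):
        return sum(c != 0 for row in rows for c in row)

    rank, outputs = len(U), len(W[0])
    return ((nonzeros(U) - rank) + (nonzeros(V) - rank)
            + (nonzeros(W) - outputs))


if __name__ == "__main__":
    print(is_ternary(U, V, W))
    print(is_valid_scheme(U, V, W, 2, 2, 2))
    print(naive_additions(U, V, W))
```

Under these assumptions, a search restricted to $Z_T$ only needs to track which entries are nonzero and their signs, so the addition count depends solely on the sparsity of the three factor matrices.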