The slowing down of Moore's law has driven the development of unconventional computing paradigms, such as specialized Ising machines tailored to solve combinatorial optimization problems. In this paper, we show a new application domain for probabilistic bit (p-bit) based Ising machines by training deep generative AI models with them. Using sparse, asynchronous, and massively parallel Ising machines, we train deep Boltzmann networks in a hybrid probabilistic-classical computing setup. We use the full MNIST dataset without any downsampling or reduction, with hardware-aware network topologies implemented in moderately sized Field Programmable Gate Arrays (FPGAs). Our machine, which uses only 4,264 nodes (p-bits) and about 30,000 parameters, achieves the same classification accuracy (90%) as an optimized software-based restricted Boltzmann machine (RBM) with approximately 3.25 million parameters. Additionally, the sparse deep Boltzmann network can generate new handwritten digits, a task at which the 3.25-million-parameter RBM fails despite achieving the same accuracy. Our hybrid computer performs a measured 50 to 64 billion probabilistic flips per second, at least an order of magnitude faster than superficially similar Graphics and Tensor Processing Unit (GPU/TPU) based implementations. The massively parallel architecture can comfortably perform the contrastive divergence algorithm (CD-n) with up to n = 10 million sweeps per update, beyond the capabilities of existing software implementations. These results demonstrate the potential of Ising machines for training traditionally hard-to-train deep generative Boltzmann networks, with further improvements possible in nanodevice-based realizations.
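The abstract refers to the contrastive divergence algorithm (CD-n), where n counts the Gibbs sweeps taken per parameter update. For readers unfamiliar with it, the following is a minimal NumPy sketch of a single CD-n update for a binary restricted Boltzmann machine. It is an illustrative software baseline only, not the paper's FPGA/p-bit implementation; the layer sizes, learning rate, and toy batch below are assumptions made for the example.

```python
# Minimal CD-n sketch for a binary restricted Boltzmann machine (RBM), using NumPy.
# Illustrative only: this is a plain software baseline, not the hybrid probabilistic
# hardware described in the paper. Shapes, learning rate, and data are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_n_update(W, b, c, v0, n=1, lr=0.01):
    """One contrastive-divergence (CD-n) parameter update from a batch of visible data v0."""
    # Positive phase: hidden activations conditioned on the data.
    ph0 = sigmoid(v0 @ W + c)               # shape: (batch, n_hidden)
    h = (rng.random(ph0.shape) < ph0)       # sample binary hidden states

    # Negative phase: n alternating Gibbs sweeps (the "n" in CD-n).
    for _ in range(n):
        pv = sigmoid(h @ W.T + b)           # p(v = 1 | h)
        v = (rng.random(pv.shape) < pv)
        ph = sigmoid(v @ W + c)             # p(h = 1 | v)
        h = (rng.random(ph.shape) < ph)

    # Gradient estimate: data correlations minus chain-end (model) correlations.
    batch = v0.shape[0]
    dW = (v0.T @ ph0 - v.T @ ph) / batch
    db = (v0 - v).mean(axis=0)
    dc = (ph0 - ph).mean(axis=0)
    return W + lr * dW, b + lr * db, c + lr * dc

# Toy usage: 784 visible units (MNIST-sized images), 64 hidden units, random binary batch.
n_vis, n_hid = 784, 64
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b = np.zeros(n_vis)
c = np.zeros(n_hid)
v_batch = (rng.random((32, n_vis)) < 0.5).astype(float)
W, b, c = cd_n_update(W, b, c, v_batch, n=10)
```

In a software loop like this, the cost of an update grows linearly with n, which is why CD with very large n (such as the n = 10 million sweeps quoted above) is impractical without massively parallel sampling hardware.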