Coupling regular topologies with optimised routing algorithms is key to pushing the performance of supercomputer interconnection networks. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalised Fat-Trees (PGFTs) that minimises congestion risk even under massive network degradation caused by equipment failure. Dmodc computes forwarding tables with a closed-form arithmetic formula, relying on a fast preprocessing phase. This allows complete re-routing of networks with tens of thousands of nodes in less than a second. In turn, this greatly helps centralised fabric management react to faults with high-quality routing tables and no impact on running applications in current and future very large-scale HPC clusters.
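To illustrate what computing forwarding entries with closed-form arithmetic can look like, the sketch below implements the classic D-mod-k up-port rule for a complete k-ary n-tree. It is not the Dmodc formula, which the abstract does not state, and it does not handle degraded topologies. The function names, the `level` and `arity` parameters, and the assumption that destinations are numbered contiguously from 0 are all hypothetical choices made for this example.

```python
from collections import Counter


def up_port(dest: int, level: int, arity: int) -> int:
    """Up-port index for `dest` at a switch `level` hops above the leaf switches.

    Classic D-mod-k rule for a complete k-ary n-tree (illustrative only,
    not the Dmodc formula): take the base-`arity` digit of the destination
    identifier that corresponds to this level.
    """
    return (dest // arity ** level) % arity


def up_port_table(dests: range, level: int, arity: int) -> dict[int, int]:
    """Map every destination to its up-port at the given level."""
    return {d: up_port(d, level, arity) for d in dests}


# Hypothetical 4-ary tree with 16 destinations numbered 0..15.
table = up_port_table(range(16), level=0, arity=4)
load = Counter(table.values())  # number of destinations assigned to each up-port
```

Because each entry is a pure function of the destination identifier and the switch level, a table can be filled in one pass with no path search, and in a complete tree the destinations spread evenly across up-ports. Handling missing links or switches, as the paper targets, requires additional information that this sketch does not model.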