The performance of large-scale computing systems often depends critically on high-performance communication networks. Dynamically reconfigurable topologies, e.g., based on optical circuit switches, are emerging as a new technology to cope with the explosive growth of datacenter traffic. Specifically, \emph{periodic} reconfigurable datacenter networks (RDCNs) such as RotorNet (SIGCOMM 2017), Opera (NSDI 2020) and Sirius (SIGCOMM 2020) have been shown to provide high throughput by emulating a \emph{complete graph} through fast periodic circuit switch scheduling. However, existing reconfigurable network designs pay a high price for this throughput: potentially high delays and, as we show as a first contribution of this paper, high buffer requirements. In particular, we show that under buffer constraints, emulating the high-throughput complete graph is infeasible at scale, and we uncover a spectrum of unexplored and attractive alternative RDCNs that emulate regular graphs of lower node degree than the complete graph. We present Mars, a periodic reconfigurable topology which emulates a $d$-regular graph with near-optimal throughput. In particular, we systematically analyze how the degree~$d$ can be optimized for throughput given the available buffer and delay tolerance of the datacenter. We further show empirically that Mars achieves higher throughput than existing systems when buffer sizes are bounded.
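To make the idea of emulating a graph through periodic circuit scheduling concrete, the following is a minimal illustrative sketch, not the schedule used by Mars or any of the cited systems. It assumes a single circuit switch that cycles through directed matchings, and it assumes cyclic-shift matchings (node $i$ connects to node $(i+s) \bmod n$); the function names and the chosen shifts are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]
Matching = List[Edge]


def cyclic_matchings(n: int, shifts: List[int]) -> List[Matching]:
    """Build one directed matching per time slot.

    Assumption for illustration: in the slot with shift s, node i has a
    circuit to node (i + s) mod n. Real systems choose their own matchings.
    """
    return [[(i, (i + s) % n) for i in range(n)] for s in shifts]


def emulated_graph(schedule: List[Matching]) -> Set[Edge]:
    """Union of all matchings over one period: the graph being emulated."""
    return {edge for matching in schedule for edge in matching}


def out_degrees(schedule: List[Matching], n: int) -> List[int]:
    """Number of distinct neighbors each node reaches within one period."""
    neighbors = [set() for _ in range(n)]
    for src, dst in emulated_graph(schedule):
        neighbors[src].add(dst)
    return [len(s) for s in neighbors]


n = 8

# n - 1 slots per period: every ordered pair gets a circuit once per period,
# so the schedule emulates the complete graph.
full = cyclic_matchings(n, list(range(1, n)))

# Only d = 3 slots per period: the schedule emulates a 3-regular directed
# graph, and the period is shorter (3 slots instead of 7).
sparse = cyclic_matchings(n, [1, 2, 4])

print(len(full), len(emulated_graph(full)), out_degrees(full, n))
print(len(sparse), len(emulated_graph(sparse)), out_degrees(sparse, n))
```

In this toy model the period length equals the number of matchings, so a lower degree $d$ means each circuit recurs sooner. Intuitively, this is the lever the abstract refers to when it trades degree against buffer and delay.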