High-performance computing (HPC) researchers have long envisioned scenarios where application workflows could be improved through the use of programmable processing elements embedded in the network fabric. Recently, vendors have introduced programmable Smart Network Interface Cards (SmartNICs) that enable computations to be offloaded to the edge of the network. There is great interest in both the HPC and high-performance data analytics communities in understanding the roles these devices may play in the data paths of upcoming systems. This paper characterizes both the networking and computing aspects of NVIDIA's new BlueField-2 SmartNIC when used in an Ethernet environment. For the networking evaluation, we conducted multiple transfer experiments between processors located at the host, the SmartNIC, and a remote host. These tests illuminate how much processing headroom is available on the SmartNIC during transfers. For the computing evaluation, we used the stress-ng benchmark to compare the BlueField-2 to other servers and place realistic bounds on the types of offload operations that are appropriate for the hardware. Our findings indicate that while the BlueField-2 provides a flexible means of processing data at the network's edge, great care must be taken not to overwhelm the hardware. While the host can easily saturate the network link, the SmartNIC's embedded processors may not have enough computing resources to sustain more than half the expected bandwidth when using kernel-space packet processing. From a computational perspective, encryption operations, memory operations under contention, and on-card IPC operations perform significantly better on the SmartNIC than on the general-purpose servers used for comparison in our experiments. Therefore, applications that mainly focus on these operations may be good candidates for offloading to the SmartNIC.