We present KRCORE, an RDMA library with a microsecond-scale control plane on commodity RDMA hardware for elastic computing. KRCORE can establish a full-fledged RDMA connection within 10 μs (hundreds to thousands of times faster than verbs) while maintaining only a small, fixed amount of connection metadata at each node, regardless of the cluster scale. The key ideas are to virtualize pre-initialized kernel-space RDMA connections instead of creating one from scratch, and to retrofit the advanced RDMA dynamically connected transport with static transport to achieve both low connection overhead and high networking speed. Under load spikes, KRCORE shortens the worker bootstrap time of an existing disaggregated key-value store (RACE Hashing) by 83%. In serverless computing (Fn), KRCORE also reduces the latency of transferring data through RDMA by 99%.
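The idea of virtualizing pre-initialized connections can be illustrated with a toy model. The sketch below is not KRCORE's API and performs no real RDMA operations; the names `PhysicalConn`, `VirtualConn`, and `ConnPool` are hypothetical, and the round-robin mapping is an assumption made for illustration. It only shows why handing out lightweight handles over a fixed pool keeps per-node connection state constant as the number of peers grows.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class PhysicalConn:
    """Hypothetical stand-in for a connection initialized ahead of time."""
    conn_id: int


@dataclass(frozen=True)
class VirtualConn:
    """Lightweight handle given to the application instead of a new connection."""
    handle: int
    peer: str
    backing: PhysicalConn


class ConnPool:
    """Toy model: a fixed pool of pre-initialized connections shared by all peers."""

    def __init__(self, pool_size: int) -> None:
        # The expensive setup is paid once, up front, not on each connect().
        self._pool = [PhysicalConn(i) for i in range(pool_size)]
        self._handles = count()

    def connect(self, peer: str) -> VirtualConn:
        # No per-peer setup: wrap an existing connection (round-robin, assumed).
        handle = next(self._handles)
        backing = self._pool[handle % len(self._pool)]
        return VirtualConn(handle, peer, backing)

    def physical_count(self) -> int:
        # Physical state is fixed by pool_size, independent of the peer count.
        return len(self._pool)
```

In this model, calling `connect` for a thousand peers on a `ConnPool(4)` still leaves four physical connections; sharing one connection across many peers is plausible only because a dynamically connected transport can address different remote nodes from a single endpoint.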