Honeycomb: ordered key-value store acceleration on an FPGA-based SmartNIC - 专知论文 (Zhuanzhi Papers)

Key-value store · Ordered · FPGA · Storage · Workload

April 6, 2023

Junyi Liu, Aleksandar Dragojevic, Shane Flemming, Antonios Katsarakis, Dario Korolija, Igor Zablotchi, Ho-cheung Ng, Anuj Kalia, Miguel Castro

In-memory ordered key-value stores are an important building block in modern distributed applications. We present Honeycomb, a hybrid software-hardware system for accelerating read-dominated workloads on ordered key-value stores that provides linearizability for all operations including scans. Honeycomb stores a B-Tree in host memory, and executes SCAN and GET on an FPGA-based SmartNIC, and PUT, UPDATE and DELETE on the CPU. This approach enables large stores and simplifies the FPGA implementation but raises the challenge of data access and synchronization across the slow PCIe bus. We describe how Honeycomb overcomes this challenge with careful data structure design, caching, request parallelism with out-of-order request execution, wait-free read operations, and batching synchronization between the CPU and the FPGA. For read-heavy YCSB workloads, Honeycomb improves the throughput of a state-of-the-art ordered key-value store by at least 1.8x. For scan-heavy workloads inspired by cloud storage, Honeycomb improves throughput by more than 2x. The cost-performance, which is more important for large-scale deployments, is improved by at least 1.5x on these workloads.
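The split of operations described in the abstract can be illustrated with a minimal Python sketch. This is not Honeycomb's implementation: `ToyOrderedStore`, `route`, and the target labels `"smartnic"` and `"cpu"` are hypothetical names introduced here, and a sorted list stands in for the B-Tree kept in host memory. It shows only which operations the abstract assigns to the FPGA-based SmartNIC (GET, SCAN) and which to the CPU (PUT, UPDATE, DELETE), and what an ordered scan returns. Caching, PCIe batching, wait-free reads, and linearizability are not modeled.

```python
import bisect
from typing import List, Optional, Tuple

# Per the abstract: reads run on the SmartNIC, writes run on the host CPU.
READ_OPS = {"GET", "SCAN"}
WRITE_OPS = {"PUT", "UPDATE", "DELETE"}


def route(op: str) -> str:
    """Return the (hypothetical) execution target label for an operation."""
    if op in READ_OPS:
        return "smartnic"
    if op in WRITE_OPS:
        return "cpu"
    raise ValueError(f"unknown operation: {op}")


class ToyOrderedStore:
    """Sorted-list stand-in for an ordered store (illustrative assumption)."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: List[str] = []

    def _find(self, key: str) -> Tuple[int, bool]:
        # Position where the key is or would be, plus whether it is present.
        i = bisect.bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def get(self, key: str) -> Optional[str]:
        i, found = self._find(key)
        return self._values[i] if found else None

    def scan(self, start: str, count: int) -> List[Tuple[str, str]]:
        # Return up to `count` pairs in key order, starting at the first key >= start.
        i = bisect.bisect_left(self._keys, start)
        return list(zip(self._keys[i:i + count], self._values[i:i + count]))

    def put(self, key: str, value: str) -> None:
        # Insert a new key or overwrite an existing one.
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def update(self, key: str, value: str) -> bool:
        # Overwrite only if the key exists; report whether it did.
        i, found = self._find(key)
        if found:
            self._values[i] = value
        return found

    def delete(self, key: str) -> bool:
        i, found = self._find(key)
        if found:
            del self._keys[i]
            del self._values[i]
        return found
```

For example, after `put("b", "2")`, `put("a", "1")`, and `put("c", "3")`, calling `scan("b", 2)` returns the pairs for `b` and `c` in key order, while `route("SCAN")` yields the read-path label and `route("DELETE")` the write-path label.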
