Real-Time LSM-Trees for HTAP Workloads - 专知 Papers


January 17, 2021

Real-Time LSM-Trees for HTAP Workloads


Hemant Saxena, Lukasz Golab, Stratos Idreos, Ihab F. Ilyas

Real-time data analytics systems such as SAP HANA, MemSQL, and IBM Wildfire employ hybrid data layouts, in which data are stored in different formats throughout their lifecycle. Recent data are stored in a row-oriented format to serve OLTP workloads and support high data rates, while older data are transformed to a column-oriented format for OLAP access patterns. We observe that a Log-Structured Merge (LSM) Tree is a natural fit for a lifecycle-aware storage engine due to its high write throughput and level-oriented structure, in which records propagate from one level to the next over time. To build a lifecycle-aware storage engine using an LSM-Tree, we make a crucial modification to allow different data layouts in different levels, ranging from purely row-oriented to purely column-oriented, leading to a Real-Time LSM-Tree. We give a cost model and an algorithm to design a Real-Time LSM-Tree that is suitable for a given workload, followed by an experimental evaluation of LASER - a prototype implementation of our idea built on top of the RocksDB key-value store. In our evaluation, LASER is almost 5x faster than Postgres (a pure row-store) and two orders of magnitude faster than MonetDB (a pure column-store) for real-time data analytics workloads.
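The central idea in the abstract, a different data layout at each LSM-Tree level, can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions and is not LASER's implementation: the class name `ToyRealTimeLSM`, the `layouts` parameter (a list of column groups per level), the doubling capacity rule, and the use of Python dicts in place of on-disk sorted runs are all hypothetical choices made for illustration. It also assumes every write supplies a full row.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Layout = List[Tuple[str, ...]]  # the column groups used by one level


class ToyRealTimeLSM:
    """Toy model: each level stores rows split into that level's column groups."""

    def __init__(self, layouts: List[Layout], capacity: int = 2) -> None:
        self.layouts = layouts
        # Assumption: level i holds at most capacity * 2**i keys before compaction.
        self.capacity = capacity
        # levels[i][g] maps key -> tuple of values for column group g of level i.
        self.levels = [[{} for _ in layout] for layout in layouts]

    def put(self, key, row: Row) -> None:
        """Write a full row into level 0, then compact if level 0 overflows."""
        self._write(0, {key: row})
        self._maybe_compact(0)

    def _write(self, level: int, rows: Dict[object, Row]) -> None:
        # Split each row into the column groups defined for this level.
        for g, cols in enumerate(self.layouts[level]):
            for key, row in rows.items():
                self.levels[level][g][key] = tuple(row[c] for c in cols)

    def _read_level(self, level: int) -> Dict[object, Row]:
        # Stitch the column groups of one level back into full rows.
        rows: Dict[object, Row] = {}
        for g, cols in enumerate(self.layouts[level]):
            for key, values in self.levels[level][g].items():
                rows.setdefault(key, {}).update(zip(cols, values))
        return rows

    def _maybe_compact(self, level: int) -> None:
        if level + 1 >= len(self.levels):
            return  # the last level never overflows in this toy model
        if len(self.levels[level][0]) <= self.capacity * 2 ** level:
            return
        # Moving rows down re-partitions them into the next level's layout;
        # newer values overwrite older ones that share a key.
        self._write(level + 1, self._read_level(level))
        for group in self.levels[level]:
            group.clear()
        self._maybe_compact(level + 1)

    def get(self, key) -> Optional[Row]:
        """Point lookup: the newest level containing the key wins."""
        for level, layout in enumerate(self.layouts):
            if key in self.levels[level][0]:
                row: Row = {}
                for g, cols in enumerate(layout):
                    row.update(zip(cols, self.levels[level][g][key]))
                return row
        return None

    def scan_column(self, col: str) -> Dict[object, object]:
        """Column scan: touch only the group holding `col` at each level."""
        out: Dict[object, object] = {}
        for level in reversed(range(len(self.levels))):  # oldest level first
            for g, cols in enumerate(self.layouts[level]):
                if col in cols:
                    idx = cols.index(col)
                    for key, values in self.levels[level][g].items():
                        out[key] = values[idx]  # newer levels overwrite older
        return out


# Level 0 is row-oriented, level 1 is hybrid, level 2 is column-oriented.
example_layouts: List[Layout] = [
    [("id", "name", "price")],
    [("id",), ("name", "price")],
    [("id",), ("name",), ("price",)],
]
```

In this sketch a point lookup on recent data reads a single group at level 0, while a scan of `price` over older data reads only the `price` group at the deepest level, which mirrors the OLTP/OLAP split the abstract describes. Choosing the column groups for each level is the design problem that the paper's cost model and algorithm address; the layouts above are picked by hand.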

