Provenance-based Data Skipping (TechReport)

May 27, 2021

Xing Niu, Ziyu Liu, Pengyuan Li, Boris Glavic

From arXiv, 20 pages, 14 figures

Database systems analyze queries to determine up front which data is needed to answer them, and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up front which data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode which data is relevant for a query. Once a provenance sketch has been captured, it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps. Our approach significantly improves performance for both disk-based and main-memory database systems.
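The idea in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation; it assumes a hypothetical in-memory table of `(city_id, sales)` rows, range-partitioned on `city_id` with made-up boundaries, and a HAVING query over `SUM(sales)`. Because the partitioning attribute is also the group-by attribute, every group lives in exactly one fragment, which is what makes reusing the sketch safe in this simplified setting. All names (`ROWS`, `BOUNDS`, `capture_sketch`, `query_with_sketch`) are illustrative.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical toy table of (city_id, sales) rows.
ROWS = [(1, 40), (1, 70), (3, 10), (4, 5), (6, 90), (6, 30), (8, 20), (9, 15)]

# Assumed range-partition boundaries on city_id:
# fragment 0 = (-inf, 3), 1 = [3, 6), 2 = [6, 9), 3 = [9, +inf).
BOUNDS = [3, 6, 9]


def fragment_of(city_id):
    """Return the index of the range fragment that holds city_id."""
    return bisect_right(BOUNDS, city_id)


def having_query(rows, threshold):
    """SELECT city_id, SUM(sales) GROUP BY city_id HAVING SUM(sales) > threshold."""
    totals = defaultdict(int)
    for city_id, sales in rows:
        totals[city_id] += sales
    return {c: s for c, s in totals.items() if s > threshold}


def capture_sketch(rows, threshold):
    """Run the query once and record the fragments that hold provenance rows,
    i.e. rows belonging to a group that appears in the result."""
    result = having_query(rows, threshold)
    return {fragment_of(c) for c, _ in rows if c in result}


def query_with_sketch(rows, threshold, sketch):
    """Answer the query using only rows from fragments listed in the sketch.
    The filter stands in for a range predicate that an index or zone map
    could evaluate without reading the skipped fragments."""
    relevant = [r for r in rows if fragment_of(r[0]) in sketch]
    return having_query(relevant, threshold), len(relevant)


sketch = capture_sketch(ROWS, 100)
result, rows_scanned = query_with_sketch(ROWS, 100, sketch)
```

In this toy run the sketch contains two of the four fragments, so the second execution touches 5 of the 8 rows while returning the same answer as the full scan. Which queries and partitionings allow such reuse in general is a question for the report itself; this sketch only covers the case where each group is confined to one fragment.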
