Recent work has proposed learned index structures, which learn the distribution of the underlying dataset to improve performance. The initial work on learned indexes showed that by learning the cumulative distribution function of the data, index structures such as the B-Tree can improve their performance by an order of magnitude while requiring a smaller memory footprint. In this paper, we present COAX, a learned index for multidimensional data that, instead of learning the distribution of the keys, learns the correlations between attributes of the dataset. Our approach is driven by the observation that in many datasets, the values of two (or more) attributes are correlated. COAX exploits these correlations to reduce the dimensionality of the dataset. More precisely, we learn how to infer one (or more) attributes $C_d$ from the remaining attributes and therefore no longer need to index $C_d$. This reduces the dimensionality and thus makes the index smaller and more efficient. We theoretically analyze the effectiveness of the proposed technique based on the predictability of the functionally dependent (FD) attributes. We further show experimentally that by predicting correlated attributes in the data, we can improve query execution time and reduce the memory overhead of the index. In our experiments, COAX reduces query execution time by 25% while shrinking the memory footprint of the index by four orders of magnitude.
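To make the core idea concrete, the following minimal Python sketch illustrates the principle under simplifying assumptions; it is our illustration, not the COAX implementation. A synthetic attribute c is correlated with attribute a, a simple least-squares line with a recorded maximum error stands in for the learned correlation, and a sorted array stands in for the remaining one-dimensional index. All names, the linear model, and the error-bound strategy are illustrative assumptions.

```python
import bisect
import random

random.seed(0)

# Synthetic data: attribute c is strongly correlated with attribute a
# (c ~ 2*a + small noise), so c is a candidate for being dropped from the index.
rows = [(a, 2.0 * a + random.uniform(-5.0, 5.0))
        for a in sorted(random.uniform(0.0, 1000.0) for _ in range(10_000))]

# "Learn" the correlation with a least-squares line (a stand-in for the learned
# model) and record the maximum residual so that lookups stay exact.
n = len(rows)
mean_a = sum(a for a, _ in rows) / n
mean_c = sum(c for _, c in rows) / n
slope = (sum((a - mean_a) * (c - mean_c) for a, c in rows)
         / sum((a - mean_a) ** 2 for a, _ in rows))
intercept = mean_c - slope * mean_a
max_err = max(abs(c - (slope * a + intercept)) for a, c in rows)

# Only attribute a is indexed; a sorted array stands in for the remaining
# one-dimensional (learned or B-Tree) index.
keys_a = [a for a, _ in rows]

def range_query_on_c(c_lo, c_hi):
    """Answer `c BETWEEN c_lo AND c_hi` without an index on c: invert the model
    into a range on a (assuming a positive slope), widen it by the error bound,
    probe the one-dimensional index, and filter out false positives."""
    a_lo = (c_lo - max_err - intercept) / slope
    a_hi = (c_hi + max_err - intercept) / slope
    lo = bisect.bisect_left(keys_a, a_lo)
    hi = bisect.bisect_right(keys_a, a_hi)
    return [(a, c) for a, c in rows[lo:hi] if c_lo <= c <= c_hi]

print(len(range_query_on_c(100.0, 120.0)))  # rows whose c falls in [100, 120]
```

Because the error bound is taken over the whole dataset, the widened range on a may contain false positives, which the final filter removes; the payoff is that no index on c needs to be built or maintained.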