Approximate Query Processing for Group-By Queries based on Conditional Generative Models - 专知 (Zhuanzhi) Papers


Stratified sampling · Model evaluation · Samples · Estimation/estimators · Generative models

January 8, 2021

Approximate Query Processing for Group-By Queries based on Conditional Generative Models


Meifan Zhang, Hongzhi Wang

Group-by queries are an important class of queries, widely used in data warehouses, data analytics, and data visualization. Approximate query processing is an effective way to improve query efficiency on big data. The answer to a group-by query involves multiple values, which makes it difficult to provide sufficiently accurate estimates for all groups. Stratified sampling improves accuracy compared with uniform sampling, but samples chosen for particular queries do not work for other queries. Online sampling chooses samples for the given query at query time, but it incurs high latency. Achieving both accuracy and efficiency at the same time is therefore a challenge. To address this challenge, we propose a sample generation framework based on a conditional generative model. The framework can generate any number of samples for a given query without accessing the data. The proposed framework, which is based on a lightweight model, can be combined with stratified sampling and online aggregation to improve estimation accuracy for group-by queries. Experimental results show that the proposed methods are both efficient and accurate.
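The trade-off described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the toy table, the group names `big` and `rare`, and every function name are assumptions made for illustration, and a per-group Gaussian stands in for the conditional generative model, whose architecture the abstract does not specify. The sketch contrasts uniform sampling, stratified sampling, and samples generated conditionally on the group for an `AVG ... GROUP BY` query.

```python
import random
import statistics
from collections import defaultdict


def make_table(seed=0):
    """Toy table of (group, value) rows: one large group and one rare group."""
    rng = random.Random(seed)
    rows = [("big", rng.gauss(100.0, 10.0)) for _ in range(10_000)]
    rows += [("rare", rng.gauss(20.0, 5.0)) for _ in range(50)]
    return rows


def group_avg(rows):
    """AVG(value) GROUP BY group, computed over whatever rows are passed in."""
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)
    return {g: statistics.fmean(vs) for g, vs in by_group.items()}


def uniform_sample(rows, n, rng):
    """Uniform sample without replacement; rare groups may be missed entirely."""
    return rng.sample(rows, n)


def stratified_sample(rows, per_group, rng):
    """Draw up to per_group rows from every group, so each group is covered."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[0]].append(row)
    sample = []
    for members in by_group.values():
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample


def fit_conditional_model(rows):
    """Assumed stand-in for a conditional generative model: (mean, std) per group."""
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)
    return {g: (statistics.fmean(vs), statistics.pstdev(vs))
            for g, vs in by_group.items()}


def generate(model, group, n, rng):
    """Generate n synthetic rows for one group using only the fitted model."""
    mu, sigma = model[group]
    return [(group, rng.gauss(mu, sigma)) for _ in range(n)]


table = make_table()
exact = group_avg(table)

rng = random.Random(1)
uniform_est = group_avg(uniform_sample(table, 100, rng))
stratified_est = group_avg(stratified_sample(table, 40, rng))

# The model is fitted once offline; at query time no table rows are read.
model = fit_conditional_model(table)
generated = [row for g in model for row in generate(model, g, 500, rng)]
generated_est = group_avg(generated)

for name, est in [("exact", exact), ("uniform", uniform_est),
                  ("stratified", stratified_est), ("generated", generated_est)]:
    print(name, {g: round(v, 2) for g, v in sorted(est.items())})
```

With a rare group of 50 rows in a table of 10,050, a uniform sample of 100 rows is expected to contain about 0.5 rows from that group, so its estimate is often missing or noisy. The stratified and generated samples cover every group by construction. The generated estimate is only as good as the fitted model, which is why the abstract proposes combining generation with stratified sampling and online aggregation.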

