Probabilistic Top-k Dominating Queries in Distributed Uncertain Databases - Zhuanzhi Papers


Rank · Extensibility · Proposal distribution · MapReduce · Processing (programming language)

May 10, 2021

Probabilistic Top-k Dominating Queries in Distributed Uncertain Databases


Niranjan Rai, Xiang Lian

In many real-world applications such as business planning and sensor data monitoring, one important yet challenging task is to rank objects (e.g., products, documents, or spatial objects) by their ranking scores and efficiently return the objects with the highest scores. In practice, because data sources are unreliable, many real-world objects contain noise and are thus imprecise and uncertain. In this paper, we study the problem of the probabilistic top-k dominating (PTD) query over such large-scale uncertain data in a distributed environment. A PTD query retrieves, from distributed uncertain databases (on multiple distributed servers), the k uncertain objects that have the largest ranking scores with high confidence. To tackle the distributed PTD problem efficiently, we propose a MapReduce framework for processing distributed PTD queries over distributed uncertain databases. In this MapReduce framework, we design effective pruning strategies to filter out false alarms in the distributed setting, propose cost-model-based index distribution mechanisms over servers, and develop efficient distributed PTD query processing algorithms. Extensive experiments on both real and synthetic data sets, under various experimental settings, demonstrate the efficiency and effectiveness of our proposed distributed PTD approach.
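To make the query in the abstract concrete, here is a minimal brute-force sketch of a top-k dominating query over partitioned uncertain data, written in a map/reduce style. It is an illustration under stated assumptions, not the paper's algorithm: it ranks objects by their expected dominance score instead of the paper's confidence-based PTD definition, it assumes smaller coordinate values are better and that objects are independent, and it uses no pruning, cost model, or index. All names (`dominates`, `prob_dominates`, `map_partial_scores`, `reduce_top_k`, `top_k_dominating`) are hypothetical.

```python
from collections import defaultdict
from heapq import nlargest

# Assumption: an uncertain object is a list of (point, probability) instances
# whose probabilities sum to 1, and objects are mutually independent.

def dominates(a, b):
    """Point a dominates b: no worse in every dimension, strictly better in one.
    Assumption: smaller values are better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def prob_dominates(obj_a, obj_b):
    """Probability that an instance of obj_a dominates an instance of obj_b."""
    return sum(pa * pb for a, pa in obj_a for b, pb in obj_b if dominates(a, b))

def map_partial_scores(candidates, partition):
    """Map step for one server: emit (candidate id, partial expected score)
    computed against the objects stored in this partition only."""
    for cid, cobj in candidates.items():
        partial = sum(prob_dominates(cobj, obj)
                      for oid, obj in partition.items() if oid != cid)
        yield cid, partial

def reduce_top_k(pairs, k):
    """Reduce step: sum partial scores per candidate and keep the k largest."""
    totals = defaultdict(float)
    for cid, partial in pairs:
        totals[cid] += partial
    return nlargest(k, totals.items(), key=lambda kv: kv[1])

def top_k_dominating(partitions, k):
    """Brute force: every object is a candidate on every partition (no pruning)."""
    candidates = {oid: obj for part in partitions for oid, obj in part.items()}
    pairs = [pair for part in partitions
             for pair in map_partial_scores(candidates, part)]
    return reduce_top_k(pairs, k)

if __name__ == "__main__":
    # Two hypothetical servers holding three uncertain 2-D objects.
    partitions = [
        {"a": [((1, 1), 1.0)], "b": [((2, 2), 0.5), ((0, 3), 0.5)]},
        {"c": [((3, 3), 1.0)]},
    ]
    print(top_k_dominating(partitions, k=2))
```

Because every candidate is scored against every partition, the cost of this sketch is quadratic in the number of objects. Avoiding that cost is the role the abstract assigns to pruning strategies and index distribution.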


