Parallel Index-Based Structural Graph Clustering and Its Approximation - 专知 Papers

SCAN · Clustering · LSH · Approximation · Graph

March 30, 2021

Parallel Index-Based Structural Graph Clustering and Its Approximation

Tom Tseng, Laxman Dhulipala, Julian Shun

SCAN (Structural Clustering Algorithm for Networks) is a well-studied, widely used graph clustering algorithm. For large graphs, however, sequential SCAN variants are prohibitively slow, and parallel SCAN variants do not effectively share work among queries with different SCAN parameter settings. Since users of SCAN often explore many parameter settings to find good clusterings, it is worthwhile to precompute an index that speeds up queries. This paper presents a practical and provably efficient parallel index-based SCAN algorithm based on GS*-Index, a recent sequential algorithm. Our parallel algorithm improves upon the asymptotic work of the sequential algorithm by using integer sorting. It is also highly parallel, achieving logarithmic span (parallel time) for both index construction and clustering queries. Furthermore, we apply locality-sensitive hashing (LSH) to design a novel approximate SCAN algorithm and prove guarantees for its clustering behavior. We present an experimental evaluation of our algorithms on large real-world graphs. On a 48-core machine with two-way hyper-threading, our parallel index construction achieves 50--151$\times$ speedup over the construction of GS*-Index. In fact, even on a single thread, our index construction algorithm is faster than GS*-Index. Our parallel index query implementation achieves 5--32$\times$ speedup over GS*-Index queries across a range of SCAN parameter values, and our implementation is always faster than ppSCAN, a state-of-the-art parallel SCAN algorithm. Moreover, our experiments show that applying LSH results in faster index construction while maintaining good clustering quality.
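The ideas in the abstract can be illustrated with a small sequential sketch. It is not the authors' implementation and none of its function names come from the paper. It assumes the standard SCAN structural (cosine) similarity over closed neighborhoods, treats a vertex as a core when at least `mu` members of its closed neighborhood (itself included) have similarity at least `eps`, and uses SimHash as one common LSH scheme for cosine similarity, since the abstract does not say which scheme is applied. The point of the index is that similarities are computed and sorted once, after which many `(eps, mu)` settings can be answered by a lookup.

```python
import math
import random

def closed_neighborhoods(edges):
    """Map each vertex to its closed neighborhood N[v] (its neighbors plus v itself)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs

def cosine_similarity(nbrs, u, v):
    """Structural similarity |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    return len(nbrs[u] & nbrs[v]) / math.sqrt(len(nbrs[u]) * len(nbrs[v]))

def build_neighbor_order(nbrs):
    """Index sketch: per vertex, similarities to its closed neighborhood, sorted descending."""
    return {
        u: sorted((cosine_similarity(nbrs, u, v) for v in nbrs[u]), reverse=True)
        for u in nbrs
    }

def core_vertices(order, eps, mu):
    """Query sketch: a vertex is a core if its mu-th largest similarity is >= eps."""
    return {u for u, sims in order.items() if len(sims) >= mu and sims[mu - 1] >= eps}

def simhash_sketches(nbrs, num_bits=256, seed=0):
    """One sign bit per random Gaussian hyperplane, applied to each neighborhood's indicator vector."""
    rng = random.Random(seed)
    weights = {v: [rng.gauss(0.0, 1.0) for _ in range(num_bits)] for v in sorted(nbrs)}
    return {
        u: [sum(weights[x][b] for x in nbrs[u]) >= 0.0 for b in range(num_bits)]
        for u in nbrs
    }

def approx_similarity(sketches, u, v):
    """Estimate cosine similarity from the fraction of differing sketch bits."""
    a, b = sketches[u], sketches[v]
    mismatches = sum(x != y for x, y in zip(a, b))
    return math.cos(math.pi * mismatches / len(a))

# Hypothetical example graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
nbrs = closed_neighborhoods(edges)
order = build_neighbor_order(nbrs)        # built once
sketches = simhash_sketches(nbrs)         # built once

# Several parameter settings reuse the same index.
for eps, mu in [(0.8, 3), (0.9, 2)]:
    print(eps, mu, sorted(core_vertices(order, eps, mu)))

print(cosine_similarity(nbrs, 2, 3), approx_similarity(sketches, 2, 3))
```

In this sketch the exact index costs one set intersection per edge, while the SimHash variant replaces each intersection with a fixed-size bit comparison. That trade of accuracy for construction time is the general kind the abstract reports for its LSH-based index.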
