美元( 3 +\ varepsilon) 美元- 近似关联集成的平行算法 (A Parallel Algorithm for $(3 + \varepsilon)$-Approximate Correlation Clustering) - 专知论文

会员服务 ·

0

相关系数 · Pivotal（公司） · GROUP · 图 · 相似度 ·

2022 年 10 月 24 日

A Parallel Algorithm for $(3 + \varepsilon)$-Approximate Correlation Clustering

翻译：美元( 3 +\ varepsilon) 美元- 近似关联集成的平行算法

Mélanie Cambus,Shreyas Pai,Jara Uitto

Grouping together similar elements in datasets is a common task in data mining and machine learning. In this paper, we study parallel algorithms for correlation clustering, where each pair of items is labeled either similar or dissimilar. The task is to partition the elements and the objective is to minimize disagreements, that is, the number of dissimilar elements grouped together and similar elements grouped separately. Our main contribution is a parallel algorithm that achieves a $(3 + \varepsilon)$-approximation to the minimum number of disagreements. Our algorithm builds on the analysis of the PIVOT algorithm by Ailon, Charikar, and Newman [JACM'08] that obtains a $3$-approximation in the centralized setting. Our design allows us to sparsify the input graph by ignoring a large portion of the nodes and edges without a large extra cost as compared to the analysis of PIVOT. This sparsification makes our technique applicable on several models of massive graph processing, such as Massively Parallel Computing (MPC) and graph streaming, where sparse graphs can typically be handled much more efficiently. For linear memory models, such as the linear memory MPC and streaming, our approach yields $O(1)$ time algorithms, where the runtime is independent of $\varepsilon$, which only appears in the memory demand.

翻译：将数据集中的类似元素组合为一组是数据挖掘和机器学习的一项共同任务。在本文中, 我们研究相关组群的平行算法, 每组项目都有相似或不同标签。任务在于将元素分开, 目标是尽量减少分歧, 即将不同元素的数量分组为一组, 并将相似元素分组为一组。我们的主要贡献是平行算法, 实现$( 3 +\ varepsilon) $- 匹配到最小的分歧数量。我们的算法基于 Ailon、 Charikar 和 Newman [JACM'08] 对 PIVOT 算法的分析, 该算法在集中环境中获得3美元对应的值。我们的设计使我们能够通过忽略大部分节点和边缘的组合和类似元素的组合组合, 而没有比 PIVOT 分析产生巨大的额外成本。这种推算法使我们的技术只适用于几种大规模图形处理模型, 例如 Massalition 平行计算(MPC) 和图表流流, 其中稀有原始的图表, 也就是 MAC 的存储流流流。

0

相关内容

相关系数

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

介孔材料受限空间中的AGET ATRP和ARGET ATRP聚合反应

国家自然科学基金

0+阅读 · 2016年12月31日

高维积分波动率矩阵的估计及其在资产投资中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

带时滞随机动力系统不变流形的光滑性

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

下一代测序数据自适应错误修正技术的研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

通过grafting from技术修饰纳米粒子的计算机模拟研究

国家自然科学基金

0+阅读 · 2013年12月31日

微波焙烧含锗氧化锌烟尘回收锗过程的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

基于MARVELD1调控的miRNA对组蛋白H4甲基化修饰/染色质重塑的影响及其与细胞增殖关系的研究

国家自然科学基金

0+阅读 · 2011年12月31日

A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

Arxiv

0+阅读 · 2022年12月8日

Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives

Arxiv

0+阅读 · 2022年12月8日

A parallelizable model-based approach for marginal and multivariate clustering

Arxiv

0+阅读 · 2022年12月7日

BiPMAP: A Toolbox for Predictions of Perceived Motion Artifacts on Modern Displays

Arxiv

0+阅读 · 2022年12月7日

Approximating Subset Sum Ratio via Partition Computations

Arxiv

0+阅读 · 2022年12月7日

On Consistency for Bulk-Bitwise Processing-in-Memory

Arxiv

0+阅读 · 2022年12月7日

Stars: Tera-Scale Graph Building for Clustering and Graph Learning

Arxiv

0+阅读 · 2022年12月5日

A novel convergence enhancement method based on Online Dimension Reduction Optimization

Arxiv

0+阅读 · 2022年11月29日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

VIP会员

文章信息

相关主题

Pivotal（公司）

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据要素发展报告(2025年)：附下载

人工智能代理提升战时舰船战备水平

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

Arxiv

0+阅读 · 2022年12月8日

Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives

Arxiv

0+阅读 · 2022年12月8日

A parallelizable model-based approach for marginal and multivariate clustering

Arxiv

0+阅读 · 2022年12月7日

BiPMAP: A Toolbox for Predictions of Perceived Motion Artifacts on Modern Displays

Arxiv

0+阅读 · 2022年12月7日

Approximating Subset Sum Ratio via Partition Computations

Arxiv

0+阅读 · 2022年12月7日

On Consistency for Bulk-Bitwise Processing-in-Memory

Arxiv

0+阅读 · 2022年12月7日

Stars: Tera-Scale Graph Building for Clustering and Graph Learning

Arxiv

0+阅读 · 2022年12月5日

A novel convergence enhancement method based on Online Dimension Reduction Optimization

Arxiv

0+阅读 · 2022年11月29日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

相关基金

介孔材料受限空间中的AGET ATRP和ARGET ATRP聚合反应

国家自然科学基金

0+阅读 · 2016年12月31日

高维积分波动率矩阵的估计及其在资产投资中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

带时滞随机动力系统不变流形的光滑性

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

下一代测序数据自适应错误修正技术的研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

通过grafting from技术修饰纳米粒子的计算机模拟研究

国家自然科学基金

0+阅读 · 2013年12月31日

微波焙烧含锗氧化锌烟尘回收锗过程的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

基于MARVELD1调控的miRNA对组蛋白H4甲基化修饰/染色质重塑的影响及其与细胞增殖关系的研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员