内核集束核心群集核心群集 (Coresets for Kernel Clustering) - 专知论文

会员服务 ·

0

核化 · 簇 · 可约的 · 可分离的 · Performer ·

2022 年 11 月 15 日

Coresets for Kernel Clustering

翻译：内核集束核心群集核心群集

Shaofeng H. -C. Jiang,Robert Krauthgamer,Jianing Lou,Yubo Zhang

We devise coresets for kernel $k$-Means with a general kernel, and use them to obtain new, more efficient, algorithms. Kernel $k$-Means has superior clustering capability compared to classical $k$-Means, particularly when clusters are non-linearly separable, but it also introduces significant computational challenges. We address this computational issue by constructing a coreset, which is a reduced dataset that accurately preserves the clustering costs. Our main result is a coreset for kernel $k$-Means that works for a general kernel and has size $\mathrm{poly}(k\epsilon^{-1})$. Our new coreset both generalizes and greatly improves all previous results; moreover, it can be constructed in time near-linear in $n$. This result immediately implies new algorithms for kernel $k$-Means, such as a $(1+\epsilon)$-approximation in time near-linear in $n$, and a streaming algorithm using space and update time $\mathrm{poly}(k \epsilon^{-1} \log n)$. We validate our coreset on various datasets with different kernels. Our coreset performs consistently well, achieving small errors while using very few points. We show that our coresets can speed up kernel $k$-Means++ (the kernelized version of the widely used $k$-Means++ algorithm), and we further use this faster kernel $k$-Means++ for spectral clustering. In both applications, we achieve up to 1000x speedup while the error is comparable to baselines that do not use coresets.

翻译：我们为内核 $k$-Means 设计核心套件,用普通内核来获取新的、更有效率的算法。 Kernel $k$-Means 与古典的美元-Means 相比,拥有更高的组群能力,特别是当集群不线性分离时,它也带来了巨大的计算挑战。我们通过构建一个核心套件来解决这个计算问题,这是一个减少的数据集,可以准确保存集群成本。我们的主要结果是为内核 $k$-Means 的核心套件,该套件为一般内核工作,并且有美元-美元(poly}(klepsol)-Means) 的大小。我们的新核心套件既能概括,又能大大改进所有以前的结果;此外,它也可以在时间-线性内建一个核心套件,比如美元-美元-美元-套件-套件的内值-内值-套件-内值-美元-内值-内值-内值-内值-内值-内值的内值-值-内值-内值-内值的内值-内值-内值-内存的值-值-值-内值-内值-内值-内值-内值-值-内算算算算算-值-值-值-值-内值-内,并用一个基值-内基值-内-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内值-内存-内存-内存-内存-内存-内存-内-内-内-内-内-内-内-内-内存-内-内-内-内-内-内-内-内-内存-内-内-内-内存-内存-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内-内存-内存-内-内存-内存-内存-内存-

0

相关内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Heisenberg 群上的 k-平面变换

国家自然科学基金

0+阅读 · 2015年12月31日

数量性状遗传关联分析中的统计方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

牛卵巢卵泡颗粒细胞CART相互作用蛋白的研究

国家自然科学基金

0+阅读 · 2011年12月31日

CDC73基因异常对颌骨骨化纤维瘤发病的作用

国家自然科学基金

0+阅读 · 2011年12月31日

以EGFR为识别靶位多靶点联合克服NSCLC EGFR TKIs耐药的基因干预研究

国家自然科学基金

0+阅读 · 2011年12月31日

Navier-Stokes方程稳定化有限元方法后验误差估计

国家自然科学基金

0+阅读 · 2011年12月31日

抗菌肽Apidaecin抑制革兰氏阴性菌E.coli的分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

水稻vdac基因家族表达产物及其生物学功能研究

国家自然科学基金

0+阅读 · 2008年12月31日

Hybrid Parametric Classes of Isotropic Covariance Functions for Spatial Random Fields

Arxiv

0+阅读 · 2023年1月13日

Fixed and adaptive landmark sets for finite pseudometric spaces

Arxiv

0+阅读 · 2023年1月13日

Data-Efficient Structured Pruning via Submodular Optimization

Arxiv

0+阅读 · 2023年1月12日

SemPPL: Predicting pseudo-labels for better contrastive representations

SemPPL: Predicting pseudo-labels for better contrastive representations

Arxiv

0+阅读 · 2023年1月12日

LB-SimTSC: An Efficient Similarity-Aware Graph Neural Network for Semi-Supervised Time Series Classification

Arxiv

0+阅读 · 2023年1月12日

Subgraph Neural Networks

Arxiv

27+阅读 · 2020年6月19日

Spectral Clustering with Graph Neural Networks for Graph Pooling

Arxiv

25+阅读 · 2020年6月3日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Attributed Graph Clustering via Adaptive Graph Convolution

Arxiv

11+阅读 · 2019年6月4日

Constructing Narrative Event Evolutionary Graph for Script Event Prediction

Arxiv

11+阅读 · 2018年5月16日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Hybrid Parametric Classes of Isotropic Covariance Functions for Spatial Random Fields

Arxiv

0+阅读 · 2023年1月13日

Fixed and adaptive landmark sets for finite pseudometric spaces

Arxiv

0+阅读 · 2023年1月13日

Data-Efficient Structured Pruning via Submodular Optimization

Arxiv

0+阅读 · 2023年1月12日

SemPPL: Predicting pseudo-labels for better contrastive representations

SemPPL: Predicting pseudo-labels for better contrastive representations

Arxiv

0+阅读 · 2023年1月12日

LB-SimTSC: An Efficient Similarity-Aware Graph Neural Network for Semi-Supervised Time Series Classification

Arxiv

0+阅读 · 2023年1月12日

Subgraph Neural Networks

Arxiv

27+阅读 · 2020年6月19日

Spectral Clustering with Graph Neural Networks for Graph Pooling

Arxiv

25+阅读 · 2020年6月3日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Attributed Graph Clustering via Adaptive Graph Convolution

Arxiv

11+阅读 · 2019年6月4日

Constructing Narrative Event Evolutionary Graph for Script Event Prediction

Arxiv

11+阅读 · 2018年5月16日

相关基金

Heisenberg 群上的 k-平面变换

国家自然科学基金

0+阅读 · 2015年12月31日

数量性状遗传关联分析中的统计方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

牛卵巢卵泡颗粒细胞CART相互作用蛋白的研究

国家自然科学基金

0+阅读 · 2011年12月31日

CDC73基因异常对颌骨骨化纤维瘤发病的作用

国家自然科学基金

0+阅读 · 2011年12月31日

以EGFR为识别靶位多靶点联合克服NSCLC EGFR TKIs耐药的基因干预研究

国家自然科学基金

0+阅读 · 2011年12月31日

Navier-Stokes方程稳定化有限元方法后验误差估计

国家自然科学基金

0+阅读 · 2011年12月31日

抗菌肽Apidaecin抑制革兰氏阴性菌E.coli的分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

水稻vdac基因家族表达产物及其生物学功能研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员