The Informativeness of K-Means for Learning Mixture Models - 专知论文


August 25, 2022

The Informativeness of K-Means for Learning Mixture Models


Zhaoqiang Liu, Vincent Y. F. Tan

From arXiv; accepted to IEEE Transactions on Information Theory

The learning of mixture models can be viewed as a clustering problem. Indeed, given data samples independently generated from a mixture of distributions, we often would like to find the correct target clustering of the samples, that is, the grouping of the samples according to which component distribution generated them. For a clustering problem, practitioners often choose the simple $k$-means algorithm. $k$-means attempts to find an optimal clustering, one that minimizes the sum of squared distances between each point and its cluster center. In this paper, we consider fundamental (i.e., information-theoretic) limits of the solutions (clusterings) obtained by optimizing the sum-of-squares distance. In particular, we provide sufficient conditions under which any optimal clustering is close to the correct target clustering, assuming that the data samples are generated from a mixture of spherical Gaussian distributions. We also generalize our results to log-concave distributions. Moreover, we show that under similar or even weaker conditions on the mixture model, any optimal clustering of the samples with reduced dimensionality is also close to the correct target clustering. These results provide intuition for the informativeness of $k$-means (with and without dimensionality reduction) as an algorithm for learning mixture models.
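The setting described in the abstract can be illustrated with a small numerical sketch. It is not the authors' code or experiment: the function names, the mixture parameters (three equal-weight spherical Gaussians in 20 dimensions), and the choice of PCA to $k-1$ dimensions as the dimensionality reduction are all assumptions made for illustration. Note also that Lloyd's heuristic with random restarts only approximates an optimal clustering, whereas the paper's statements concern any clustering that truly minimizes the sum-of-squares distance.

```python
import numpy as np
from itertools import permutations


def sample_spherical_gmm(n, means, sigma, rng):
    """Draw n points from an equal-weight mixture of spherical Gaussians."""
    k, d = means.shape
    labels = rng.integers(0, k, size=n)
    points = means[labels] + sigma * rng.standard_normal((n, d))
    return points, labels


def sum_of_squares(points, labels, k):
    """k-means objective: squared distances from points to their cluster mean."""
    total = 0.0
    for j in range(k):
        members = points[labels == j]
        if len(members) > 0:
            total += ((members - members.mean(axis=0)) ** 2).sum()
    return total


def lloyd_kmeans(points, k, rng, n_init=50, n_iter=100):
    """Lloyd's heuristic with random restarts; returns the best labels found."""
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        centers = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(n_iter):
            dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            new_centers = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        cost = sum_of_squares(points, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


def pca_project(points, dim):
    """Project centered data onto its top `dim` principal components."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


def misclassification_rate(true_labels, found_labels, k):
    """Fraction of mislabeled points, minimized over relabelings of clusters."""
    best = 1.0
    for perm in permutations(range(k)):
        mapped = np.array(perm)[found_labels]
        best = min(best, float(np.mean(mapped != true_labels)))
    return best


# Illustrative parameters (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
k, d, n = 3, 20, 600
means = 10.0 * np.eye(k, d)  # component means far apart relative to sigma
points, true_labels = sample_spherical_gmm(n, means, sigma=1.0, rng=rng)

labels_full, cost_full = lloyd_kmeans(points, k, rng)
labels_low, _ = lloyd_kmeans(pca_project(points, k - 1), k, rng)

err_full = misclassification_rate(true_labels, labels_full, k)
err_low = misclassification_rate(true_labels, labels_low, k)
target_cost = sum_of_squares(points, true_labels, k)
print("error (full dim):", err_full)
print("error (reduced dim):", err_low)
print("k-means cost vs. target-clustering cost:", cost_full, target_cost)
```

With component means this well separated, the clustering found in either the original or the reduced space should agree with the target clustering on almost every sample; shrinking the separation or raising `sigma` is a simple way to see the two drift apart.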

