Clustering with UMAP: Why and How Connectivity Matters - Zhuanzhi (专知) Papers


August 12, 2021


Ayush Dalmia, Suzanna Sia

From arXiv, 9 pages

Topology-based dimensionality reduction methods such as t-SNE and UMAP have seen increasing success and popularity on high-dimensional data. These methods have strong mathematical foundations and are based on the intuition that the topology in low dimensions should be close to that of high dimensions. Given that the initial topological structure is a precursor to the success of the algorithm, this naturally raises the question: what makes a "good" topological structure for dimensionality reduction? In this paper, which focuses on UMAP, we study the effects of node connectivity (k-Nearest Neighbors vs. mutual k-Nearest Neighbors) and relative neighborhood (Adjacent via Path Neighbors) on dimensionality reduction. We explore these concepts through extensive ablation studies on 4 standard image and text datasets (MNIST, FMNIST, 20NG, AG), reducing to 2 and 64 dimensions. Our findings indicate that a more refined notion of connectivity (mutual k-Nearest Neighbors with a minimum spanning tree), together with a flexible method of constructing the local neighborhood (Path Neighbors), can achieve a much better representation than default UMAP, as measured by downstream clustering performance.
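The connectivity idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, and it assumes the minimum spanning tree is computed over the complete Euclidean distance graph, which may differ from the paper's construction. It shows that a mutual k-NN graph (an edge is kept only when each endpoint is among the other's k nearest neighbors) can split into disconnected components, and that adding minimum spanning tree edges reconnects them.

```python
import math


def pairwise_distances(points):
    """Dense Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def knn_sets(dist, k):
    """For each point, the set of its k nearest neighbors (excluding itself)."""
    n = len(dist)
    result = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        result.append(set(order[:k]))
    return result


def mutual_knn_edges(neighbors):
    """Keep edge (i, j), i < j, only if each point is in the other's k-NN set."""
    return {
        (i, j)
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
        if i < j and i in neighbors[j]
    }


def mst_edges(dist):
    """Prim's algorithm on the complete distance graph (an assumption here)."""
    n = len(dist)
    in_tree = {0}
    edges = set()
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        a, b = min(
            ((u, v) for u in in_tree for v in range(n) if v not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.add((min(a, b), max(a, b)))
        in_tree.add(b)
    return edges


def count_components(n, edges):
    """Number of connected components, via a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(x) for x in range(n)})


# Two well-separated toy clusters of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
dist = pairwise_distances(points)

mutual = mutual_knn_edges(knn_sets(dist, k=2))
combined = mutual | mst_edges(dist)

print(count_components(len(points), mutual))    # 2: the clusters are disconnected
print(count_components(len(points), combined))  # 1: the MST adds a bridging edge
```

On this toy input the mutual k-NN graph alone leaves the two clusters disconnected, which is the kind of situation the added minimum spanning tree edges are meant to address.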

