Automatic topography of high-dimensional data sets by non-parametric Density Peak clustering


February 5, 2021


Maria d'Errico, Elena Facco, Alessandro Laio, Alex Rodriguez

From arXiv. A Supplementary Information document is available in the ancillary files folder.

Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach providing this description in the form of a topography of the data, namely a human-readable chart of the probability density from which the data are harvested. The approach is based on an unsupervised extension of Density Peak clustering and a non-parametric density estimator that measures the probability density in the manifold containing the data. This allows finding automatically the number and the height of the peaks of the probability density, and the depth of the "valleys" separating them. Importantly, the density estimator provides a measure of the error, which allows distinguishing genuine density peaks from density fluctuations due to finite sampling. The approach thus provides robust and visual information about the density peaks' height, their statistical reliability, and their hierarchical organization, offering a conceptually powerful extension of the standard clustering partitions. We show that this framework is particularly useful in the analysis of complex data sets.
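The pipeline described in the abstract (estimate the density with an error bar, find the peaks, measure the saddles between them, and discard peaks that do not rise above their saddle by more than the statistical error) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain k-nearest-neighbour density estimate in the embedding dimension rather than the paper's estimator on the data manifold, an approximate error of 1/sqrt(k) on the log density, and a simple merging rule. The function names `knn_graph`, `log_density`, and `density_peaks`, and the parameters `k` and `z`, are hypothetical names chosen for this example.

```python
import math
import random

def knn_graph(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs."""
    out = []
    for i, p in enumerate(points):
        d = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        out.append(d[:k])
    return out

def log_density(points, nbrs, k):
    """Plain k-NN log density (assumption: not the paper's estimator).
    Also returns the approximate error on the log density, 1/sqrt(k)."""
    n, dim = len(points), len(points[0])
    log_unit_ball = 0.5 * dim * math.log(math.pi) - math.lgamma(0.5 * dim + 1)
    # nb[-1][0] is the distance to the k-th neighbour; assumes no duplicate points.
    g = [math.log(k / n) - log_unit_ball - dim * math.log(nb[-1][0]) for nb in nbrs]
    return g, 1.0 / math.sqrt(k)

def density_peaks(points, k=10, z=2.0):
    """Return (labels, log densities); each label is the index of a peak point."""
    nbrs = knn_graph(points, k)
    g, err = log_density(points, nbrs, k)
    order = sorted(range(len(points)), key=lambda i: -g[i])
    rank = {i: r for r, i in enumerate(order)}  # lower rank = higher density
    label = [None] * len(points)
    for i in order:
        # Neighbours are sorted by distance, so higher[0] is the closest denser one.
        higher = [j for _, j in nbrs[i] if rank[j] < rank[i]]
        # A point with no denser neighbour is a candidate peak and labels itself.
        label[i] = label[higher[0]] if higher else i
    while True:
        saddle = {}  # (a, b) -> highest log density found on the a/b border
        for i in range(len(points)):
            for _, j in nbrs[i]:
                a, b = label[i], label[j]
                if a != b:
                    key = (min(a, b), max(a, b))
                    saddle[key] = max(saddle.get(key, -math.inf), min(g[i], g[j]))
        # A peak is kept only if it exceeds the saddle by z times the summed errors.
        weak = [(min(g[a], g[b]) - s, a, b) for (a, b), s in saddle.items()
                if min(g[a], g[b]) - s < z * 2 * err]
        if not weak:
            return label, g
        _, a, b = min(weak)  # merge the least significant peak first
        lo, hi = (a, b) if g[a] < g[b] else (b, a)
        label = [hi if c == lo else c for c in label]

# Usage: two well-separated Gaussian blobs in two dimensions.
random.seed(0)
blob_a = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
blob_b = [(random.gauss(50, 1), random.gauss(0, 1)) for _ in range(60)]
labels, log_rho = density_peaks(blob_a + blob_b, k=10, z=2.0)
print(len(set(labels)), "peaks judged significant")
```

In this sketch `z` plays the role of a confidence threshold: with `z=0` every local density maximum is kept as a peak, and larger values merge peaks whose height above the saddle is comparable to the estimated error. The peak heights and saddle depths that survive are the ingredients a topography chart of the density would display.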

