数据分离性新颖的内在数据测量 (A Novel Intrinsic Measure of Data Separability) - 专知论文

会员服务 ·

0

分离的 · Performer · state-of-the-art · MoDELS · 相互独立的 ·

2021 年 9 月 11 日

A Novel Intrinsic Measure of Data Separability

翻译：数据分离性新颖的内在数据测量

Shuyue Guan,Murray Loew

from arxiv, 16 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2005.13120

In machine learning, the performance of a classifier depends on both the classifier model and the separability/complexity of datasets. To quantitatively measure the separability of datasets, we create an intrinsic measure -- the Distance-based Separability Index (DSI), which is independent of the classifier model. We consider the situation in which different classes of data are mixed in the same distribution to be the most difficult for classifiers to separate. We then formally show that the DSI can indicate whether the distributions of datasets are identical for any dimensionality. And we verify the DSI to be an effective separability measure by comparing to several state-of-the-art separability/complexity measures using synthetic and real datasets. Having demonstrated the DSI's ability to compare distributions of samples, we also discuss some of its other promising applications, such as measuring the performance of generative adversarial networks (GANs) and evaluating the results of clustering methods.

翻译：在机器学习中,分类器的性能既取决于分类模型,也取决于数据集的分离性/复杂性。为了量化地测量数据集的分离性,我们创建了一个内在的计量标准 -- -- 独立于分类器模型的远程分离性指数(DSI),我们认为不同类别的数据混合在同一分布中的情况是分类器最难分离的。然后我们正式表明,DSI可以表明数据集的分布是否与任何维度相同。我们通过比较几种最先进的合成和真实数据集的分离性/兼容性计量标准,核实DSI是否是一种有效的分离性计量标准。我们展示了DSI比较样品分布的能力,我们还讨论了其其他一些有希望的应用,例如测量基因对抗网络的性能,评估组合方法的结果。

0

相关内容

分离的

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

专知会员服务

62+阅读 · 2020年2月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】Kaggle机器学习数据集推荐

【推荐】Kaggle机器学习数据集推荐

机器学习研究会

8+阅读 · 2017年11月19日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Out of distribution detection for skin and malaria images

Arxiv

0+阅读 · 2021年11月2日

Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

Arxiv

0+阅读 · 2021年11月1日

Smoke Testing for Machine Learning: Simple Tests to Discover Severe Defects

Arxiv

0+阅读 · 2021年10月29日

SA-SDR: A novel loss function for separation of meeting style data

Arxiv

0+阅读 · 2021年10月29日

Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

Arxiv

3+阅读 · 2021年3月5日

SepNE: Bringing Separability to Network Embedding

SepNE: Bringing Separability to Network Embedding

Arxiv

3+阅读 · 2019年2月26日

Active Metric Learning for Supervised Classification

Arxiv

9+阅读 · 2018年3月28日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

Triplet-based Deep Similarity Learning for Person Re-Identification

Arxiv

3+阅读 · 2018年2月9日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

state-of-the-art

相互独立的

相关VIP内容

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

专知会员服务

62+阅读 · 2020年2月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【普林斯顿博士论文】在线学习：优化、控制与学习理论

不确定环境下无人机三维路径规划研究 | 221页

【NeurIPS2025】《LeapFactual：基于条件流匹配的可靠视觉反事实解释》

大语言模型将如何改变军事指挥结构

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】Kaggle机器学习数据集推荐

【推荐】Kaggle机器学习数据集推荐

机器学习研究会

8+阅读 · 2017年11月19日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Out of distribution detection for skin and malaria images

Arxiv

0+阅读 · 2021年11月2日

Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

Arxiv

0+阅读 · 2021年11月1日

Smoke Testing for Machine Learning: Simple Tests to Discover Severe Defects

Arxiv

0+阅读 · 2021年10月29日

SA-SDR: A novel loss function for separation of meeting style data

Arxiv

0+阅读 · 2021年10月29日

Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

Arxiv

3+阅读 · 2021年3月5日

SepNE: Bringing Separability to Network Embedding

SepNE: Bringing Separability to Network Embedding

Arxiv

3+阅读 · 2019年2月26日

Active Metric Learning for Supervised Classification

Arxiv

9+阅读 · 2018年3月28日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

Triplet-based Deep Similarity Learning for Person Re-Identification

Arxiv

3+阅读 · 2018年2月9日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

微信扫码咨询专知VIP会员