A geometric framework for outlier detection in high-dimensional data - Zhuanzhi (专知) Papers

Outliers · Manifolds · Manifold learning · Learning · Manifold assumption

July 29, 2022

A geometric framework for outlier detection in high-dimensional data

Moritz Herrmann, Florian Pfisterer, Fabian Scheipl

From arXiv; 24 pages, 6 figures; extended introduction, contribution, and discussion sections; additional experiments added

Outlier or anomaly detection is an important task in data analysis. We discuss the problem from a geometrical perspective and provide a framework that exploits the metric structure of a data set. Our approach rests on the manifold assumption, i.e., that the observed, nominally high-dimensional data lie on a much lower dimensional manifold and that this intrinsic structure can be inferred with manifold learning methods. We show that exploiting this structure significantly improves the detection of outlying observations in high-dimensional data. We also suggest a novel, mathematically precise, and widely applicable distinction between distributional and structural outliers based on the geometry and topology of the data manifold that clarifies conceptual ambiguities prevalent throughout the literature. Our experiments focus on functional data as one class of structured high-dimensional data, but the framework we propose is completely general and we include image and graph data applications. Our results show that the outlier structure of high-dimensional and non-tabular data can be detected and visualized using manifold learning methods and quantified using standard outlier scoring methods applied to the manifold embedding vectors.
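
The two-step idea in the abstract (embed the data with a manifold learning method, then apply a standard outlier score to the embedding vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name specific methods here, so classical MDS and a k-nearest-neighbor distance score are used as stand-ins, and the synthetic curves, the function names `classical_mds` and `knn_outlier_score`, and all parameter values are assumptions made for this example.

```python
import numpy as np

def classical_mds(X, n_components=2):
    """Embed the rows of X with classical (Torgerson) MDS on Euclidean distances."""
    sq = np.sum(X ** 2, axis=1)
    # Squared pairwise distances; clip tiny negatives caused by rounding.
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def knn_outlier_score(Z, k=5):
    """Score each embedded point by the distance to its k-th nearest neighbor."""
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt(np.sum(diff ** 2, axis=-1))
    # After sorting each row, column 0 is the zero distance of a point to itself.
    return np.sort(D, axis=1)[:, k]

# Synthetic functional data (assumed for illustration): 100 phase-shifted sine
# curves on a 50-point grid, plus one vertically shifted curve as the outlier.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
phases = rng.uniform(0.0, 0.5, size=100)
inliers = np.sin(2 * np.pi * (t[None, :] + phases[:, None]))
outlier = np.sin(2 * np.pi * (t + 0.25)) + 3.0
curves = np.vstack([inliers, outlier])       # the outlier is row 100

embedding = classical_mds(curves, n_components=2)
scores = knn_outlier_score(embedding, k=5)
print("highest-scoring curve:", int(np.argmax(scores)))
```

The synthetic data are constructed so that the shifted curve lies far from the one-parameter family of sine curves, which is the kind of observation the embedding-plus-scoring pipeline is meant to surface. Any other embedding method or scoring rule could be substituted for the two stand-ins.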
