Dimensionality reduction (DR) algorithms compress high-dimensional data into a lower-dimensional representation while preserving important features of the data. DR is a critical step in many analysis pipelines, as it enables visualisation, noise reduction, and efficient downstream processing of the data. In this work, we introduce the ProbDR variational framework, which interprets a wide range of classical DR algorithms as probabilistic inference algorithms. ProbDR encompasses PCA, CMDS, LLE, LE, MVU, diffusion maps, kPCA, Isomap, (t-)SNE, and UMAP. In our framework, a low-dimensional latent variable is used to construct a covariance, precision, or graph Laplacian matrix, which can be used as part of a generative model for the data. Inference is performed by optimising an evidence lower bound. We demonstrate the internal consistency of our framework and show that it enables the use of probabilistic programming languages (PPLs) for DR. Additionally, we illustrate that the framework facilitates reasoning about unseen data and argue that our generative models approximate Gaussian processes (GPs) on manifolds. By providing a unified view of DR, our framework facilitates communication, reasoning about uncertainties, model composition, and extensions, particularly when domain knowledge is present.
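The covariance construction described above can be illustrated in minimal form: low-dimensional latents X define K = XXᵀ + σ²I, and the observed dimensions are modelled as zero-mean Gaussians with that covariance. The sketch below is a hypothetical illustration, not the paper's implementation: it fits X by direct maximum likelihood (the dual probabilistic-PCA special case) via gradient ascent, rather than by optimising a full evidence lower bound, and the function name and hyperparameters are assumptions for the example.

```python
import numpy as np

def fit_latents(Y, q=1, sigma2=0.1, lr=1e-2, steps=500, seed=0):
    """ProbDR-style sketch: model the d columns of Y as draws from
    N(0, K) with K = X X^T + sigma2 * I built from latents X (n x q).
    X is fit by maximum likelihood here, a simplification of the
    variational inference used in the framework itself."""
    n, d = Y.shape
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((n, q))   # latent coordinates
    S = Y @ Y.T                             # data second moments
    I = np.eye(n)
    for _ in range(steps):
        K = X @ X.T + sigma2 * I
        Kinv = np.linalg.inv(K)
        # gradient of the Gaussian log-likelihood with respect to X
        grad = (Kinv @ S @ Kinv - d * Kinv) @ X
        X += (lr / d) * grad                # gradient ascent step
    return X

# Toy check: 3-D points lying near a 1-D line; the single latent
# dimension should recover the position along that line.
rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 50)
Y = np.outer(t, rng.standard_normal(3)) + 0.01 * rng.standard_normal((50, 3))
X = fit_latents(Y - Y.mean(axis=0), q=1)
```

Swapping the Gaussian likelihood or the covariance parameterisation (e.g. a precision or graph Laplacian in place of K) is what lets the framework recover the different classical DR algorithms listed above.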