Transformers Can Do Bayesian Inference - Zhuanzhi Papers


Bayesian inference · Learning · Inference · Forward propagation · Approximation

February 8, 2022

Transformers Can Do Bayesian Inference


Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, Frank Hutter

From arXiv; accepted at ICLR 2022

Currently, it is hard to reap the benefits of deep learning for Bayesian methods, which allow the explicit specification of prior knowledge and accurately capture model uncertainty. We present Prior-Data Fitted Networks (PFNs). PFNs leverage large-scale machine learning techniques to approximate a large set of posteriors. The only requirement for PFNs to work is the ability to sample from a prior distribution over supervised learning tasks (or functions). Our method restates the objective of posterior approximation as a supervised classification problem with a set-valued input: it repeatedly draws a task (or function) from the prior, draws a set of data points and their labels from it, masks one of the labels and learns to make probabilistic predictions for it based on the set-valued input of the rest of the data points. Presented with a set of samples from a new supervised learning task as input, PFNs make probabilistic predictions for arbitrary other data points in a single forward propagation, having learned to approximate Bayesian inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems, with over 200-fold speedups in multiple setups compared to current methods. We obtain strong results in very diverse areas such as Gaussian process regression, Bayesian neural networks, classification for small tabular data sets, and few-shot image classification, demonstrating the generality of PFNs. Code and trained PFNs are released at https://github.com/automl/TransformersCanDoBayesianInference.
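The training recipe described in the abstract (draw a task from the prior, draw a labelled data set from it, mask one label, predict it from the rest) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the prior is a toy one over 1-D threshold classifiers, the function names are hypothetical, and no Transformer is trained. Because this toy prior admits a closed-form posterior predictive, the sketch uses that exact predictor in place of a network, to show the target that a model minimizing the held-out negative log-likelihood would be approximating.

```python
import math
import random

def sample_task(rng):
    """Draw a task from the (hypothetical) prior: a threshold t ~ Uniform(0, 1)."""
    return rng.random()

def sample_training_example(rng, n_points=6):
    """Draw a task, draw a labelled data set from it, and hold out one label."""
    t = sample_task(rng)
    xs = [rng.random() for _ in range(n_points)]
    data = [(x, 1 if x > t else 0) for x in xs]
    # Points are exchangeable, so the last one can play the masked role.
    *context, (query_x, target_y) = data
    return context, query_x, target_y

def posterior_predictive(context, query_x):
    """Exact P(y = 1 | query_x, context) under the toy threshold prior.

    The posterior over t is uniform on [lo, hi), where lo is the largest x
    labelled 0 and hi is the smallest x labelled 1.
    """
    lo = max([x for x, y in context if y == 0], default=0.0)
    hi = min([x for x, y in context if y == 1], default=1.0)
    if hi <= lo:  # degenerate interval; only reachable with tied inputs
        return 1.0 if query_x > lo else 0.0
    return min(1.0, max(0.0, (query_x - lo) / (hi - lo)))

def nll(p_one, target_y, eps=1e-12):
    """Negative log-likelihood of the held-out label (the training loss)."""
    p = p_one if target_y == 1 else 1.0 - p_one
    return -math.log(max(p, eps))

def mean_nll(predict, n_examples=2000, seed=0):
    """Monte Carlo estimate of the expected held-out loss under the prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_examples):
        context, query_x, target_y = sample_training_example(rng)
        total += nll(predict(context, query_x), target_y)
    return total / n_examples
```

Calling `mean_nll(posterior_predictive)` and `mean_nll(lambda context, x: 0.5)` compares the exact Bayesian predictor with an uninformed one. The constant predictor always scores log 2 (about 0.693), and the exact predictor scores lower. In a PFN, a learned set-input network would take the place of `posterior_predictive` and be trained on such samples.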



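Related content

Bayesian inference

Bayesian inference is a statistical method for decision-making under uncertainty. Its distinguishing feature is that it draws on both prior information and sample information to reach a statistical conclusion.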
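As a small worked illustration of combining prior information with sample information, the sketch below uses a Beta prior on a success probability together with binomial observations. The choice of this conjugate model and the function names are assumptions made for the example, not something the page specifies.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior, the posterior is
    Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Prior information: Beta(2, 2), i.e. a prior mean of 0.5.
# Sample information: 7 successes and 3 failures (sample frequency 0.7).
post_alpha, post_beta = beta_binomial_update(2, 2, successes=7, failures=3)
posterior_mean = beta_mean(post_alpha, post_beta)  # 9 / 14
```

The posterior mean of 9/14 lies between the prior mean (0.5) and the sample frequency (0.7), showing how the conclusion reflects both sources of information.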

