Unsupervised tree boosting for learning probability distributions - 专知 (Zhuanzhi) Papers


Boosting · Unsupervised · Learning · Copulas · Reducible

July 13, 2022

Unsupervised tree boosting for learning probability distributions

Naoki Awaya, Li Ma

from arxiv, 47 pages, 9 figures

We propose an unsupervised tree boosting algorithm for inferring the underlying sampling distribution of an i.i.d. sample based on fitting additive tree ensembles in a fashion analogous to supervised tree boosting. Integral to the algorithm is a new notion of "addition" on probability distributions that leads to a coherent notion of "residualization", i.e., subtracting a probability distribution from an observation to remove the distributional structure from the sampling distribution of the latter. We show that these notions arise naturally for univariate distributions through cumulative distribution function (CDF) transforms and compositions due to several "group-like" properties of univariate CDFs. While the traditional multivariate CDF does not preserve these properties, a new definition of multivariate CDF can restore them, thereby allowing the notions of "addition" and "residualization" to be formulated for multivariate settings as well. This then gives rise to the unsupervised boosting algorithm based on forward-stagewise fitting of an additive tree ensemble, which sequentially reduces the Kullback-Leibler divergence from the truth. The algorithm allows analytic evaluation of the fitted density and outputs a generative model that can be readily sampled from. We enhance the algorithm with scale-dependent shrinkage and a two-stage strategy that separately fits the marginals and the copula. The algorithm then performs competitively with state-of-the-art deep-learning approaches in multivariate density estimation on multiple benchmark datasets.
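The univariate idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the one-split piecewise-uniform CDF (`split_cdf`), the masses `P1` and `P2`, and the composition order F = G2∘G1 are all assumptions chosen for illustration. Under these assumptions, "adding" two distributions means composing their CDFs, and "residualizing" an observation means pushing it through a component's CDF, so that removing both components from a draw of F recovers the uniform variate that generated it.

```python
import numpy as np

# Hypothetical one-split "tree" distribution on [0, 1]:
# mass p_left on [0, 0.5], mass 1 - p_left on (0.5, 1], uniform within each half.

def split_cdf(x, p_left):
    """CDF of the one-split distribution (an illustrative assumption)."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 0.5,
                    2.0 * p_left * x,
                    p_left + 2.0 * (1.0 - p_left) * (x - 0.5))

def split_cdf_inv(u, p_left):
    """Inverse CDF (quantile function) of the one-split distribution."""
    u = np.asarray(u, dtype=float)
    return np.where(u <= p_left,
                    u / (2.0 * p_left),
                    0.5 + (u - p_left) / (2.0 * (1.0 - p_left)))

def split_pdf(x, p_left):
    """Piecewise-constant density of the one-split distribution."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 0.5, 2.0 * p_left, 2.0 * (1.0 - p_left))

P1, P2 = 0.7, 0.3  # assumed left-half masses of the two components

# "Addition" (assumed order): the ensemble CDF is F = G2 o G1,
# so sampling uses F^{-1} = G1^{-1} o G2^{-1}.
rng = np.random.default_rng(0)
u = rng.uniform(size=10_000)
x = split_cdf_inv(split_cdf_inv(u, P2), P1)

# "Residualization": apply each component CDF in turn.
r1 = split_cdf(x, P1)   # G1 removed; r1 is distributed as G2
r2 = split_cdf(r1, P2)  # G2 removed; r2 is uniform on [0, 1]

# Analytic density of the ensemble via the chain rule:
# f(x) = g1(x) * g2(G1(x)).
log_density = np.log(split_pdf(x, P1)) + np.log(split_pdf(r1, P2))
```

Because each step is an invertible CDF transform, the same chain gives both a sampler (run the inverses on uniform draws) and a closed-form density (sum the component log-densities along the residual path), which mirrors the two properties the abstract claims for the fitted model.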

