MLE convergence speed to information projection of exponential family: Criterion for model dimension and sample size -- complete proof version -- - Zhuanzhi Papers


INFORMS · Maximum likelihood estimation · MoDELS · Criterion · Estimation/estimator

June 3, 2021


For a parametric model of distributions, the distribution in the model that is closest to the true distribution, which lies outside the model, is considered. When closeness between two distributions is measured by the Kullback-Leibler (K-L) divergence, this closest distribution is called the "information projection." The estimation risk of the maximum likelihood estimator (MLE) is defined as the expectation of the K-L divergence between the information projection and the predictive distribution with the MLE plugged in. Here, the asymptotic expansion of the risk is derived up to order $n^{-2}$, and a sufficient condition on the risk is investigated under which the Bayes error rate between the true distribution and the information projection is lower than a specified value. Combining these results, the "$p-n$ criterion" is proposed, which determines whether the MLE is sufficiently close to the information projection for the given model and sample. In particular, the criterion for an exponential family model is relatively simple and can be used for a complex model whose normalizing constant has no explicit form. This criterion can constitute a solution to the sample size or model acceptance problem. Use of the $p-n$ criterion is demonstrated for two practical datasets. The relationship between the results and information criteria is also studied.
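The notions of information projection and MLE risk in the abstract can be made concrete with a small toy example. The sketch below is not the paper's $p-n$ criterion or its $n^{-2}$ expansion; it only illustrates the definitions under assumptions chosen for this example. The model is the Gaussian family $N(\mu, 1)$ with unknown mean, and the true distribution is Uniform$(-3, 3)$, which lies outside the model. In this setting the information projection is $N(\mu^*, 1)$ with $\mu^* = E[X]$, the MLE is the sample mean, and the risk equals $\mathrm{Var}(X)/(2n)$ exactly. The order of the K-L arguments and all function names are assumptions of this sketch.

```python
import random
import statistics


def kl_gauss_unit_var(mu_a, mu_b):
    """K-L divergence D(N(mu_a, 1) || N(mu_b, 1)) = (mu_a - mu_b)^2 / 2."""
    return 0.5 * (mu_a - mu_b) ** 2


def mle_risk_monte_carlo(n, reps=4000, low=-3.0, high=3.0, seed=0):
    """Monte Carlo estimate of E[D(projection || plug-in MLE)].

    True distribution (assumed for illustration): Uniform(low, high).
    Model: N(mu, 1), so the information projection has mean E[X].
    """
    rng = random.Random(seed)
    mu_star = 0.5 * (low + high)  # mean of the information projection
    losses = []
    for _ in range(reps):
        sample = [rng.uniform(low, high) for _ in range(n)]
        mu_hat = statistics.fmean(sample)  # MLE of mu is the sample mean
        losses.append(kl_gauss_unit_var(mu_star, mu_hat))
    return statistics.fmean(losses)


def exact_risk(n, low=-3.0, high=3.0):
    """Closed form for this toy case: Var(X) / (2n)."""
    var_true = (high - low) ** 2 / 12.0
    return var_true / (2.0 * n)


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, mle_risk_monte_carlo(n), exact_risk(n))
```

In this toy case the risk is $3/(2n)$ rather than the $1/(2n)$ that would hold if the true distribution belonged to the model, which shows how misspecification changes the leading $n^{-1}$ term.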

