From robust tests to Bayes-like posterior distributions

Within the Bayesian paradigm and for a given loss function, we propose the construction of a new type of posterior distribution for estimating the law of an $n$-sample. The loss functions we have in mind are based on the total variation distance, the Hellinger distance, and some $\mathbb{L}_{j}$-distances. We prove that, with probability close to one, this new posterior distribution concentrates its mass in a neighbourhood of the law of the data, for the chosen loss function, provided that this law belongs to the support of the prior or, at least, lies close enough to it. The new posterior distribution is therefore robust to a possible misspecification of the prior or, more precisely, of its support. For the total variation and squared Hellinger losses, we also show that the posterior distribution keeps its concentration properties when the data are only independent, hence not necessarily i.i.d., provided that most of their marginals are close enough to some probability distribution around which the prior puts enough mass. The posterior distribution is therefore also stable with respect to the equidistribution assumption. We illustrate these results with several applications. We consider the problems of estimating a location parameter, or both the location and the scale, of a density in a nonparametric framework. Finally, we tackle the problem of estimating a density, with the squared Hellinger loss, in a high-dimensional parametric model under some sparsity conditions. The results established in this paper are non-asymptotic and provide, as much as possible, explicit constants.
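As a small illustration of two of the loss functions named in the abstract, the sketch below computes the total variation distance and the squared Hellinger distance between two discrete distributions on a common finite support. It is not taken from the paper: the function names, the example vectors `p` and `q`, and the 1/2 normalisation of the squared Hellinger distance (conventions differ between authors) are all assumptions made for illustration.

```python
import math


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Assumes p and q are probability vectors on the same finite support.
    """
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def squared_hellinger(p, q):
    """Squared Hellinger distance with the 1/2 normalisation (assumed here),
    so that the value always lies in [0, 1]."""
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))


# Hypothetical example distributions on a three-point support.
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]

tv = total_variation(p, q)
h2 = squared_hellinger(p, q)
print(f"total variation:   {tv:.4f}")
print(f"squared Hellinger: {h2:.4f}")
```

With this normalisation the two quantities satisfy $h^2 \le \mathrm{TV} \le h\sqrt{2 - h^2}$, so a neighbourhood that is small for one loss is also small for the other.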


