Estimation and Concentration of Missing Mass of Functions of Discrete Probability Distributions

Given a positive function $g$ from $[0,1]$ to the reals, the function's missing mass in a sequence of i.i.d. samples, defined as the sum of $g(\Pr(x))$ over the missing letters $x$ (the letters that do not appear in the sample), is introduced and studied. The missing mass of a function generalizes the classical missing mass and has several interesting connections to other related estimation problems. Minimax estimation is studied for order-$\alpha$ missing mass ($g(p)=p^{\alpha}$) for both integer and non-integer values of $\alpha$. Exact minimax convergence rates are obtained for the integer case. Concentration is studied for a class of functions, and specific results are derived for order-$\alpha$ missing mass and missing Shannon entropy ($g(p)=-p\log p$). Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.
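As a concrete reading of the definition above, the following minimal Python sketch computes the missing mass of a function $g$ for a known distribution and a given sample. It is an illustration only, not the paper's estimator: in practice the distribution is unknown and the quantity must be estimated. The names `missing_mass_of_function`, `order_alpha` and `shannon_term` are hypothetical, and the sum is restricted to letters with positive probability so that $-p\log p$ is well defined.

```python
import math
from typing import Callable, Dict, Hashable, Iterable


def missing_mass_of_function(
    p: Dict[Hashable, float],
    sample: Iterable[Hashable],
    g: Callable[[float], float],
) -> float:
    """Sum g(p(x)) over letters x of the alphabet that are absent from the sample.

    Assumption: p maps each letter to its true probability; letters with
    zero probability are skipped.
    """
    seen = set(sample)
    return sum(g(px) for x, px in p.items() if x not in seen and px > 0)


def order_alpha(alpha: float) -> Callable[[float], float]:
    """g(p) = p**alpha, the order-alpha missing mass."""
    return lambda px: px ** alpha


def shannon_term(px: float) -> float:
    """g(p) = -p log p, the missing Shannon entropy (natural logarithm)."""
    return -px * math.log(px)


# Toy example: the letter "c" never appears in the sample.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
sample = ["a", "a", "b"]

classical = missing_mass_of_function(p, sample, order_alpha(1))  # g(p) = p
order_two = missing_mass_of_function(p, sample, order_alpha(2))  # g(p) = p^2
entropy = missing_mass_of_function(p, sample, shannon_term)      # g(p) = -p log p
```

With $g(p)=p$ the function reduces to the classical missing mass, which here is the total probability of the unseen letter "c".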

