We study the relative entropy of the empirical probability vector with respect to the true probability vector in multinomial sampling of $k$ categories, which, when multiplied by the sample size $n$, is also the log-likelihood ratio statistic. We generalize a recent result and show that the moment generating function of the statistic is bounded by a polynomial of degree $n$ on the unit interval, uniformly over all true probability vectors. We characterize the family of polynomials indexed by $(k,n)$ and obtain explicit formulae. Consequently, we develop Chernoff-type tail bounds, including a closed-form version obtained from a large-sample expansion of the bound minimizer. Our bound dominates the classic method-of-types bound and is competitive with the state of the art. We demonstrate it with an application to estimating the proportion of unseen butterflies.
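As a rough illustration of the quantities named in the abstract, the following Python sketch computes the relative entropy of an empirical count vector with respect to a true probability vector, the corresponding statistic $n$ times that relative entropy, and the classic method-of-types tail bound used as the baseline. The function names are hypothetical and chosen for this example only. The method-of-types bound is written in its standard textbook form (number of types times $e^{-n\varepsilon}$, in nats), which is an assumption about the exact variant being compared against. The polynomial bound developed in the paper is not reproduced here.

```python
import math


def relative_entropy(counts, p):
    """Relative entropy (in nats) of the empirical vector counts/n w.r.t. p."""
    n = sum(counts)
    total = 0.0
    for c, p_i in zip(counts, p):
        if c > 0:  # terms with zero count contribute 0 by convention
            q_i = c / n
            total += q_i * math.log(q_i / p_i)
    return total


def log_likelihood_ratio(counts, p):
    """Sample size n times the relative entropy, as described in the abstract."""
    return sum(counts) * relative_entropy(counts, p)


def method_of_types_bound(n, k, eps):
    """Textbook bound: P(relative entropy >= eps) <= C(n+k-1, k-1) * exp(-n*eps).

    C(n+k-1, k-1) is the number of possible empirical count vectors (types)
    for n draws over k categories. The result is capped at 1.
    """
    num_types = math.comb(n + k - 1, k - 1)
    return min(1.0, num_types * math.exp(-n * eps))


if __name__ == "__main__":
    counts = [30, 50, 20]   # illustrative counts, not data from the paper
    p = [0.3, 0.4, 0.3]     # illustrative true probabilities
    print(relative_entropy(counts, p))
    print(log_likelihood_ratio(counts, p))
    print(method_of_types_bound(n=100, k=3, eps=0.2))
```

The cap at 1 is a convenience: for small $n\varepsilon$ the number of types dominates and the raw bound is vacuous, which is part of why sharper Chernoff-type bounds are of interest.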