We prove an exponential-decay concentration inequality bounding the tail probability of the difference between the log-likelihood of discrete random variables on a finite alphabet and the negative entropy. The concentration bound we derive holds uniformly over all parameter values. The new result improves the convergence rate in an earlier result of Zhao (2020), from $(K^2\log K)/n=o(1)$ to $(\log K)^2/n=o(1)$, where $n$ is the sample size and $K$ is the size of the alphabet. We further prove that the rate $(\log K)^2/n=o(1)$ is optimal. The results are extended to misspecified log-likelihoods for grouped random variables. We give applications of the new result in information theory.
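For concreteness, the quantity under study can be written as follows; the notation is ours and only sketches the setup described above, not the paper's exact statement. With $X_1,\dots,X_n$ i.i.d. from a distribution $p=(p_1,\dots,p_K)$ on a $K$-letter alphabet and entropy $H(p)=-\sum_{k=1}^{K}p_k\log p_k$, the centered log-likelihood whose tails are bounded is
$$\frac{1}{n}\sum_{i=1}^{n}\log p(X_i) \;+\; H(p),$$
and the stated improvement means the exponential tail bound is informative whenever $(\log K)^2/n\to 0$, rather than the more restrictive $(K^2\log K)/n\to 0$.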