Sometimes we estimate a probability not with the maximum likelihood estimator but with a smoothed estimator, in order to cope with the zero-frequency problem. This is often the case with the Naive Bayes classifier. Laplace smoothing is a popular choice: the Laplace-smoothed estimate is the expected value of the posterior distribution of the probability when the prior is assumed to be uniform. In this paper, we investigate confidence intervals for the Laplace smoothing estimator. We show that the likelihood function underlying this confidence interval is the same as the likelihood function for the maximum likelihood estimate of a Bernoulli trial probability. Although the confidence interval of the maximum likelihood estimator of the Bernoulli trial probability has been well studied, and approximate formulas for it are well known, we cannot use that interval because it can contain the value 0, which is not suitable for the Naive Bayes classifier. We are also interested in the accuracy of the existing approximation methods, since they are frequently used but their accuracy is rarely discussed. We therefore obtain the confidence interval by numerically integrating the likelihood function, and we report the difference between the interval computed in this way and the intervals given by the approximate formulas. Finally, we provide a URL where all of the intervals that we computed are available.
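To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes k successes in n Bernoulli trials, an equal-tailed interval, and a trapezoid rule on a uniform grid; the function names (`laplace_estimate`, `integrated_interval`, `wald_interval`) and the grid size are illustrative choices, and the Wald formula stands in as one example of a well-known approximate interval.

```python
from statistics import NormalDist


def laplace_estimate(k, n):
    """Laplace-smoothed estimate: the posterior mean under a uniform prior."""
    return (k + 1) / (n + 2)


def integrated_interval(k, n, level=0.95, grid=20001):
    """Equal-tailed interval from the normalized likelihood p^k (1-p)^(n-k).

    Assumptions (not taken from the paper): equal tails, trapezoid rule,
    uniform grid on [0, 1]. Large n may underflow in this naive form.
    """
    h = 1.0 / (grid - 1)
    ps = [i * h for i in range(grid)]
    like = [p ** k * (1.0 - p) ** (n - k) for p in ps]

    # Cumulative trapezoid integral of the likelihood, then normalize to 1.
    cdf = [0.0]
    for i in range(1, grid):
        cdf.append(cdf[-1] + 0.5 * (like[i - 1] + like[i]) * h)
    total = cdf[-1]
    cdf = [c / total for c in cdf]

    def quantile(q):
        # Linear interpolation inside the first grid cell that reaches q.
        for i in range(1, grid):
            if cdf[i] >= q:
                span = cdf[i] - cdf[i - 1]
                frac = (q - cdf[i - 1]) / span if span > 0 else 0.0
                return ps[i - 1] + frac * h
        return 1.0

    alpha = 1.0 - level
    return quantile(alpha / 2), quantile(1.0 - alpha / 2)


def wald_interval(k, n, level=0.95):
    """Normal-approximation interval around the maximum likelihood estimate k/n."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2)
    p_hat = k / n
    half = z * (p_hat * (1.0 - p_hat) / n) ** 0.5
    return p_hat - half, p_hat + half


if __name__ == "__main__":
    k, n = 0, 10  # zero-frequency case
    print("Laplace estimate:   ", laplace_estimate(k, n))
    print("Integrated interval:", integrated_interval(k, n))
    print("Wald interval:      ", wald_interval(k, n))
```

In the zero-frequency case (k = 0), the Wald interval in this sketch collapses onto 0, while the numerically integrated interval has a strictly positive lower bound and contains the Laplace estimate.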