We compute the bias, variance, and approximate confidence intervals for the efficiency of a random selection process under several special conditions that occur in practical data analysis. We consider the following cases: (a) the number of trials is not constant but drawn from a Poisson distribution, (b) the samples are weighted, and (c) the numbers of successes and failures have a variance that exceeds that of a Poisson process, which is the case, for example, when these numbers are obtained from a fit to a mixture of signal and background events. Generalized Wilson intervals based on these variances are computed, and their coverage probability is studied. The efficiency estimators are unbiased in all considered cases except when the samples are weighted. The standard Wilson interval is also suitable for case (a). For most of the other cases, generalized Wilson intervals can be computed with closed-form expressions.
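For orientation, the standard Wilson interval mentioned above can be sketched as follows. This is a minimal illustration of the textbook formula for `k` successes in `n` trials, not the generalized intervals derived in this work. The function name `wilson_interval` and the default `z = 1` (roughly a 68% interval) are assumptions made for this example.

```python
import math


def wilson_interval(k, n, z=1.0):
    """Standard Wilson score interval for k successes in n trials.

    z is the standard-normal quantile; z = 1 (about 68% coverage) is an
    assumed default for this sketch, not a value taken from the text.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")

    p = k / n                      # plain efficiency estimate
    denom = 1.0 + z * z / n        # common normalisation factor
    center = (p + z * z / (2.0 * n)) / denom
    half = z / denom * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))
    return center - half, center + half


if __name__ == "__main__":
    lo, hi = wilson_interval(5, 10)
    print(f"efficiency = 0.5, interval = ({lo:.4f}, {hi:.4f})")
```

Unlike the naive normal-approximation interval, this interval stays within [0, 1] and keeps a non-zero width even when `k = 0` or `k = n`.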