The optimal receiver operating characteristic (ROC) curve, which gives the maximum probability of detection as a function of the probability of false alarm, is a key information-theoretic indicator of the difficulty of a binary hypothesis testing (BHT) problem. It is well known that the optimal ROC curve for a given BHT problem, which corresponds to the likelihood ratio test, is determined by the probability distributions of the observed data under the two hypotheses. In some cases these two distributions are unknown or computationally intractable, but independent samples of the likelihood ratio can still be observed. This raises the problem of estimating the optimal ROC curve for a BHT problem from such samples. The maximum likelihood estimator of the optimal ROC curve is derived and shown to converge to the true optimal ROC curve in the Lévy metric as the number of observations tends to infinity. A classical empirical estimator, based on estimating the two types of error probabilities from two separate sets of samples, is also considered. In simulation experiments, the maximum likelihood estimator is observed to be considerably more accurate than the empirical estimator, especially when the number of samples obtained under one of the two hypotheses is small. The area under the maximum likelihood estimator is also derived; it is a consistent estimator of the true area under the optimal ROC curve.
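The classical empirical estimator described above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `empirical_roc` and `area_under_curve`, the decision rule "declare the alternative when the likelihood ratio is at least the threshold", and the handling of ties without randomization are all choices made here for the example. The maximum likelihood estimator is not reproduced, since the abstract does not state its form.

```python
import numpy as np


def empirical_roc(lr_h0, lr_h1):
    """Empirical ROC points from two separate sets of likelihood ratio samples.

    lr_h0: likelihood ratio samples observed under the null hypothesis
    lr_h1: likelihood ratio samples observed under the alternative
    (both argument names are hypothetical, chosen for this sketch)

    Returns (p_fa, p_d): estimated false alarm and detection probabilities,
    ordered from the point (0, 0) to the point (1, 1).
    """
    lr_h0 = np.asarray(lr_h0, dtype=float)
    lr_h1 = np.asarray(lr_h1, dtype=float)
    # Candidate thresholds: every distinct observed value, largest first.
    thresholds = np.unique(np.concatenate([lr_h0, lr_h1]))[::-1]
    # Fraction of samples in each set that meet or exceed each threshold.
    p_fa = np.array([np.mean(lr_h0 >= t) for t in thresholds])
    p_d = np.array([np.mean(lr_h1 >= t) for t in thresholds])
    # Prepend (0, 0), the test whose threshold exceeds every sample.
    return np.concatenate([[0.0], p_fa]), np.concatenate([[0.0], p_d])


def area_under_curve(p_fa, p_d):
    """Area under the piecewise-linear curve through the points (trapezoid rule)."""
    p_fa = np.asarray(p_fa, dtype=float)
    p_d = np.asarray(p_d, dtype=float)
    return float(np.sum(np.diff(p_fa) * (p_d[1:] + p_d[:-1]) / 2.0))


if __name__ == "__main__":
    # Assumed toy problem: X ~ N(0, 1) under the null and X ~ N(1, 1) under
    # the alternative, for which the likelihood ratio is exp(x - 1/2).
    rng = np.random.default_rng(0)
    lr_h0 = np.exp(rng.normal(0.0, 1.0, size=200) - 0.5)
    lr_h1 = np.exp(rng.normal(1.0, 1.0, size=20) - 0.5)  # few samples under H1
    p_fa, p_d = empirical_roc(lr_h0, lr_h1)
    print("estimated area under the ROC curve:", area_under_curve(p_fa, p_d))
```

The toy run deliberately uses far fewer samples under one hypothesis than the other, which is the regime in which the abstract reports the empirical estimator to be least accurate.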