We revisit the outlier hypothesis testing framework of Li \emph{et al.} (TIT 2014) and derive fundamental limits for the optimal test. In outlier hypothesis testing, one is given multiple observed sequences, most of which are generated i.i.d. from a nominal distribution. The task is to identify the set of outlying sequences, which are generated according to anomalous distributions. The nominal and anomalous distributions are \emph{unknown}. We consider the case of multiple outliers, where the number of outliers is unknown and each outlier can follow a different anomalous distribution. Under this setting, we study the tradeoff among the probabilities of misclassification error, false alarm and false reject. Specifically, we propose a threshold-based test that ensures exponential decay of the misclassification error and false alarm probabilities. We study two constraints on the false reject probability: that it is a non-vanishing constant, and that it decays exponentially. For both cases, we characterize bounds on the false reject probability, as a function of the threshold, for each tuple of nominal and anomalous distributions. Finally, we demonstrate the asymptotic optimality of our test under the generalized Neyman-Pearson criterion.