We revisit universal outlier hypothesis testing (Li \emph{et al.}, TIT 2014) and derive fundamental limits for the optimal test. In outlier hypothesis testing, one is given multiple observed sequences, most of which are generated i.i.d. from a nominal distribution. The task is to identify the set of outlying sequences, which are generated according to anomalous distributions. Both the nominal and the anomalous distributions are \emph{unknown}. We study the tradeoff among the probabilities of misclassification error, false alarm, and false reject for tests that satisfy weak conditions on the rate of decrease of these error probabilities as a function of the sequence length. Specifically, we propose a threshold-based universal test that ensures exponential decay of the misclassification error and false alarm probabilities. We study two constraints on the false reject probability: that it be a non-vanishing constant, or that it decay exponentially. In both cases, we characterize bounds on the false reject probability, as a function of the threshold, for each pair of nominal and anomalous distributions, and we demonstrate the optimality of our test in the generalized Neyman-Pearson sense. We first consider the case of at most one outlier and then generalize our results to the case of multiple outliers, where the number of outliers is unknown and each outlier can follow a different anomalous distribution.
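To make the idea of a threshold-based universal test concrete, the following is a minimal sketch for the case of at most one outlier over a finite alphabet. The abstract does not specify the test statistic, so everything here is an assumption for illustration: the hypothetical `score` function sums the KL divergences between each remaining empirical distribution and their average (a statistic in the spirit of Li \emph{et al.}), and the hypothetical `threshold_test` declares an outlier only when the second-smallest score exceeds the threshold `lam`. It is not presented as the exact test analyzed in the paper.

```python
import math
from collections import Counter


def empirical_dist(seq, alphabet):
    # Empirical distribution (type) of a sequence over a finite alphabet.
    counts = Counter(seq)
    n = len(seq)
    return [counts[a] / n for a in alphabet]


def kl(p, q):
    # KL divergence D(p || q) in nats; symbols with p = 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def score(types, i):
    # Assumed statistic: with sequence i set aside, sum the KL divergences
    # from each remaining type to the average of the remaining types.
    # A small value means the remaining sequences look mutually consistent.
    rest = [t for k, t in enumerate(types) if k != i]
    avg = [sum(col) / len(rest) for col in zip(*rest)]
    return sum(kl(t, avg) for t in rest)


def threshold_test(sequences, alphabet, lam):
    # Returns the index of the declared outlier, or None for "no outlier".
    # Assumes at least three sequences and at most one outlier.
    types = [empirical_dist(s, alphabet) for s in sequences]
    scores = [score(types, i) for i in range(len(types))]
    best = min(range(len(scores)), key=scores.__getitem__)
    runner_up = min(s for k, s in enumerate(scores) if k != best)
    # Declare an outlier only if every other candidate is clearly worse.
    return best if runner_up > lam else None


sequences = ["0101010101", "0011001101", "1111111110", "0100110101"]
print(threshold_test(sequences, "01", lam=0.1))
```

In this sketch, raising `lam` makes the test more conservative: it declares "no outlier" more often, trading a higher chance of missing a true outlier for fewer wrong declarations. The precise tradeoff and the roles of the three error probabilities are what the paper characterizes; the code only illustrates the decision rule's shape.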