Anomaly detection among a large number of processes arises in many applications, ranging from dynamic spectrum access to cybersecurity. In such problems, one can often obtain noisy observations aggregated from a chosen subset of processes that conforms to a tree structure. The distribution of these observations, based on which the presence of anomalies is detected, may be only partially known. This gives rise to the need for a search strategy that accounts for both the sample complexity and the detection accuracy, and that copes with statistical models known only up to some missing parameters. In this work, we propose a sequential search strategy using two variations of the \acl{gllr} statistic. Our proposed \ac{hds} strategy is shown to be order-optimal with respect to the size of the search space and asymptotically optimal with respect to the detection accuracy. An explicit upper bound on the error probability of \ac{hds} is established for the finite sample regime. Extensive experiments demonstrate the performance gains of \ac{hds} over existing methods.