Anomaly detection among a large number of processes arises in many applications ranging from dynamic spectrum access to cybersecurity. In such problems one can often obtain noisy observations aggregated from a chosen subset of processes that conforms to a tree structure. The distribution of these observations, based on which the presence of anomalies is detected, may be only partially known. This gives rise to the need for a search strategy designed to account for both the sample complexity and the detection accuracy, as well as cope with statistical models that are known only up to some missing parameters. In this work we propose a sequential search strategy using two variations of the Generalized Local Likelihood Ratio statistic. Our proposed Hierarchical Dynamic Search (HDS) strategy is shown to be order-optimal with respect to the size of the search space and asymptotically optimal with respect to the detection accuracy. An explicit upper bound on the error probability of HDS is established for the finite sample regime. Extensive experiments are conducted, demonstrating the performance gains of HDS over existing methods.
翻译:从动态频谱访问到网络安全等许多应用中都产生了大量过程的异常探测。在这些问题中,人们往往能够从符合树结构的选定一组过程获得杂乱的观测结果。这些观测结果的分布可能只是部分地为人们所知道,根据这些观测结果,发现异常现象可能只是部分的根据。这就需要一种搜索战略,既考虑到抽样的复杂性,又考虑到探测的准确性,并处理只知道某些缺失参数的统计模型。在这项工作中,我们提出一个顺序搜索战略,采用通用地方近似比率统计的两个变量。我们拟议的高层次动态搜索战略显示,相对于搜索空间的大小而言,是秩序最优化的,对于探测准确性来说也是最适度的。HDS误差概率的明显上限是有限的抽样系统。我们进行了广泛的实验,展示了HDS对现有方法的性能收益。