Robust change-point detection for large-scale data streams has many real-world applications in industrial quality control, signal detection, and biosurveillance. Developing efficient schemes is highly non-trivial, however, because of three challenges: (1) the subset of affected data streams is sparse and unknown, (2) the data may contain unexpected outliers, and (3) real-time monitoring and detection must scale computationally. In this article, we develop a family of efficient real-time robust detection schemes for monitoring large-scale independent data streams. For each data stream, we construct a new local robust detection statistic, called the $L_{\alpha}$-CUSUM statistic, that reduces the effect of outliers by using the Box-Cox transformation of the likelihood function. The global scheme then raises an alarm based on the sum of shrinkage transformations of these local $L_{\alpha}$-CUSUM statistics, so as to filter out unaffected data streams. In addition, we propose a new concept, the {\em false alarm breakdown point}, to measure the robustness of online monitoring schemes, and a worst-case detection efficiency score to measure detection efficiency when the data contain outliers. We then characterize the breakdown point and the efficiency score of our proposed schemes. Asymptotic analysis and numerical simulations illustrate the robustness and efficiency of the proposed schemes.
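As an illustration only, the following Python sketch shows one way the two-stage idea described above could look. It assumes unit-variance Gaussian streams with a hypothetical pre-change mean `mu0` and post-change mean `mu1`, a local recursion that adds the difference of the Box-Cox-transformed post-change and pre-change densities and resets at zero, and soft-thresholding as the shrinkage transformation. The abstract does not specify these details, so the function names and the parameters `alpha`, `b`, and `threshold` are illustrative choices, not the authors' implementation.

```python
import math


def normal_logpdf(x, mean, sd=1.0):
    """Log-density of N(mean, sd^2) at x (Gaussian model is an assumption)."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def box_cox_density(log_f, alpha):
    """Box-Cox transform (f^alpha - 1) / alpha of a density f = exp(log_f).

    For alpha == 0 the transform is taken as its limit, log f.
    """
    if alpha == 0.0:
        return log_f
    return (math.exp(alpha * log_f) - 1.0) / alpha


def local_update(w_prev, x, alpha, mu0=0.0, mu1=1.0):
    """One step of an assumed CUSUM-type recursion for a single stream."""
    inc = (box_cox_density(normal_logpdf(x, mu1), alpha)
           - box_cox_density(normal_logpdf(x, mu0), alpha))
    return max(w_prev + inc, 0.0)


def global_statistic(local_stats, b):
    """Sum of soft-thresholded local statistics (one possible shrinkage)."""
    return sum(max(w - b, 0.0) for w in local_stats)


def monitor(rows, alpha, b, threshold):
    """Return the 1-based time of the first alarm, or None if none is raised.

    rows: iterable of equal-length sequences, one observation per stream.
    """
    stats = None
    for n, row in enumerate(rows, start=1):
        if stats is None:
            stats = [0.0] * len(row)
        stats = [local_update(w, x, alpha) for w, x in zip(stats, row)]
        if global_statistic(stats, b) >= threshold:
            return n
    return None


if __name__ == "__main__":
    # Two streams; only the first one shifts upward after time 5.
    data = [[0.0, 0.0]] * 5 + [[3.0, 0.0]] * 5
    print("alarm time:", monitor(data, alpha=0.1, b=0.5, threshold=3.0))
    # A single gross outlier moves the alpha = 0 statistic far more
    # than the alpha = 0.5 statistic.
    print("alpha=0.0:", local_update(0.0, 50.0, 0.0))
    print("alpha=0.5:", local_update(0.0, 50.0, 0.5))
```

In this sketch, setting `alpha = 0` reduces the increment to the log-likelihood ratio, so the recursion falls back to the classical CUSUM update. For `alpha > 0` the transformed Gaussian densities are bounded, which caps the contribution of any single extreme observation, and the soft threshold `b` zeroes out streams whose local statistic stays small.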