How can we track synchronized behavior in a stream of time-stamped tuples, such as mobile devices installing and uninstalling applications in lockstep to boost their ranks in the app store? We model such tuples as entries in a streaming tensor whose modes grow in size over time as new attribute values arrive. Synchronized behavior tends to form dense blocks (i.e., subtensors) in such a tensor, signaling anomalous behavior or interesting communities. However, existing dense block detection methods either assume a static tensor or lack an efficient algorithm for the streaming setting. We therefore propose a fast streaming algorithm, AugSplicing, which detects the top dense blocks by incrementally splicing the previously detected blocks with the blocks found in the incoming tuples, avoiding a re-run over the entire history at every tracking time step. AugSplicing is based on a splicing condition that guides the algorithm (Section 4). Compared to state-of-the-art methods, our method is (1) effective at detecting fraudulent behavior in installation data from real-world apps and at finding a synchronized group of students with interesting features in campus Wi-Fi data; (2) robust, backed by a splicing theory for dense block detection; and (3) streaming, running faster than the existing streaming algorithm with closely comparable accuracy.
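To make the tensor model concrete, the following is a minimal sketch of how time-stamped tuples could be accumulated into a sparse streaming tensor and how the density of a candidate block could be scored. It does not implement AugSplicing or its splicing condition. The names `StreamingTensor`, `block_mass`, and `block_density` are hypothetical, and the density measure (block mass divided by the sum of the block's mode sizes) is an assumption chosen for illustration, not necessarily the measure used in this work.

```python
from collections import defaultdict


class StreamingTensor:
    """Sparse count tensor built from a stream of tuples (hypothetical helper)."""

    def __init__(self, n_modes=3):
        self.n_modes = n_modes
        self.entries = defaultdict(int)  # index tuple -> number of occurrences
        self.mode_values = [set() for _ in range(n_modes)]

    def add(self, tup):
        """Add one tuple; modes grow whenever an unseen attribute value arrives."""
        if len(tup) != self.n_modes:
            raise ValueError("tuple length must equal the number of modes")
        self.entries[tup] += 1
        for mode, value in enumerate(tup):
            self.mode_values[mode].add(value)

    def shape(self):
        """Current size of each mode."""
        return tuple(len(values) for values in self.mode_values)


def block_mass(tensor, block):
    """Sum of the entries whose index falls inside the block in every mode."""
    return sum(
        count
        for index, count in tensor.entries.items()
        if all(value in block[mode] for mode, value in enumerate(index))
    )


def block_density(tensor, block):
    """Assumed density: mass divided by the sum of the block's mode sizes."""
    size = sum(len(values) for values in block)
    return block_mass(tensor, block) / size if size else 0.0


# Toy stream: three devices install the same two apps in one time bin,
# plus one unrelated background tuple.
tensor = StreamingTensor()
for user in ("u1", "u2", "u3"):
    for app in ("a1", "a2"):
        tensor.add((user, app, "t1"))
tensor.add(("u9", "a7", "t5"))

block = ({"u1", "u2", "u3"}, {"a1", "a2"}, {"t1"})
density = block_density(tensor, block)
```

In this toy stream the lockstep installs fill every cell of the candidate block, so its density is much higher than that of the tensor as a whole. A streaming detector would need to update such scores as new tuples arrive rather than rescanning all entries, which this sketch does not attempt.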