Monitoring streamed data to detect abnormal behaviour (variously known as event detection, anomaly detection, change detection, or outlier detection) underlies many applications of the Internet of Things. There, one often collects data from a variety of sources, with asynchronous sampling and missing data. In this setting, one can predict abnormal behaviour using low-rank techniques. In particular, we assume that normal observations come from a low-rank subspace, prior to being corrupted by uniformly distributed noise. Correspondingly, we aim to recover a representation of the subspace and to perform event detection by running point-to-subspace distance queries for incoming data. Specifically, we apply a variant of low-rank factorisation, which considers interval uncertainty sets around "known entries", to a suitable flattening of the input data to obtain a low-rank model. On-line, we compute the distance of incoming data to the low-rank normal subspace and update the subspace to keep it consistent with the seasonal changes present. For the distance computation, we suggest subsampling the coordinates. We bound the one-sided error as a function of the number of coordinates employed, using techniques from learning theory and computational geometry. In our experimental evaluation, we test the ability of the proposed algorithm to identify samples of abnormal behaviour in induction-loop data from Dublin, Ireland.
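
The subsampled point-to-subspace distance query can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes the subspace is already given as a basis matrix, uses the Euclidean norm and an ordinary least-squares fit, and works on synthetic data. The names `subspace_distance`, `is_event`, `basis`, and `noise_bound`, as well as all sizes and noise levels, are hypothetical choices made for the illustration.

```python
import numpy as np


def subspace_distance(basis, x, coords=None):
    """Euclidean distance from x to the column span of `basis`.

    If `coords` is given, only those coordinates (rows) are used, which is
    the subsampled variant of the query. Assumption: least-squares fit.
    """
    if coords is not None:
        basis, x = basis[coords], x[coords]
    coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return float(np.linalg.norm(x - basis @ coeffs))


def is_event(basis, x, coords, noise_bound):
    """Flag x as an event if its subsampled distance exceeds what bounded
    per-coordinate noise of magnitude `noise_bound` could explain."""
    threshold = noise_bound * np.sqrt(len(coords))
    return subspace_distance(basis, x, coords) > threshold


# Synthetic illustration (all sizes and noise levels are assumptions).
rng = np.random.default_rng(0)
n, r, k = 200, 3, 40                      # ambient dimension, rank, sample size
noise_bound = 0.05
basis = rng.standard_normal((n, r))       # stand-in for a learned subspace

# A normal observation: a point in the subspace plus uniform noise.
normal = basis @ rng.standard_normal(r) + rng.uniform(-noise_bound, noise_bound, n)
# An abnormal observation: the same point with a large perturbation added.
event = normal + 2.0 * rng.standard_normal(n)

coords = rng.choice(n, size=k, replace=False)  # subsampled coordinates

full_normal = subspace_distance(basis, normal)
sub_normal = subspace_distance(basis, normal, coords)
full_event = subspace_distance(basis, event)
sub_event = subspace_distance(basis, event, coords)
```

In this sketch the subsampled distance can never exceed the full distance, because the coefficients that are optimal on all coordinates remain feasible on any subset of them. The error of the subsampled test is therefore one-sided: it may miss an event that the full test would flag, but it does not raise an alarm that the full test would not. How the miss probability shrinks with the number of sampled coordinates is what the bound in the abstract addresses.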