Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights can be obtained by mining temporal patterns from these time series. Unlike traditional pattern mining, temporal pattern mining (TPM) adds a temporal aspect to the extracted patterns, making them more expressive. However, adding the temporal dimension to patterns results in exponential growth of the search space, which significantly increases the complexity of the mining process. Current TPM approaches either cannot scale to large datasets or typically work on pre-processed event sequences rather than directly on time series. This paper presents our comprehensive Frequent Temporal Pattern Mining from Time Series (FTPMfTS) approach, which makes the following contributions: (1) An end-to-end FTPMfTS process that directly takes time series as input and produces frequent temporal patterns as output. (2) The Hierarchical Temporal Pattern Graph Mining (HTPGM) algorithm, which uses efficient data structures to enable fast computation of support and confidence. (3) A number of pruning techniques for HTPGM that yield significantly faster mining. (4) An approximate version of HTPGM that relies on mutual information to prune unpromising time series and thus significantly reduces the search space. (5) An extensive experimental evaluation on real-world datasets from the energy and smart city domains, which shows that HTPGM outperforms the baselines and scales to large datasets. The approximate HTPGM achieves up to three orders of magnitude speedup compared to the baselines and consumes significantly less memory, while obtaining high accuracy compared to the exact HTPGM.
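To make the idea behind contribution (4) concrete, the following is a minimal sketch of mutual-information-based pruning, not the HTPGM implementation itself. It assumes the time series have already been converted into aligned sequences of discrete symbols, and it keeps only those pairs of series whose normalized mutual information reaches a threshold. The function names, the normalization by the smaller entropy, and the threshold parameter `min_nmi` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned symbol sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must be aligned and of equal length")
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def correlated_pairs(series, min_nmi):
    """Return the pairs of series whose normalized MI is at least min_nmi.

    `series` maps a series name to its list of symbols. Normalizing by
    min(H(X), H(Y)) is an assumption of this sketch, not taken from the paper.
    """
    kept = []
    for a, b in combinations(sorted(series), 2):
        h = min(entropy(series[a]), entropy(series[b]))
        if h == 0:
            continue  # a constant series shares no information with others
        if mutual_information(series[a], series[b]) / h >= min_nmi:
            kept.append((a, b))
    return kept


# Hypothetical symbolic series for three appliances.
series = {
    "kettle": ["On", "On", "Off", "Off"],
    "toaster": ["On", "On", "Off", "Off"],
    "lamp": ["On", "Off", "On", "Off"],
}
pairs = correlated_pairs(series, min_nmi=0.5)
```

In this toy input, `kettle` and `toaster` always switch together, while `lamp` is statistically independent of both, so only the first pair survives. A miner would then search for temporal patterns among the surviving series only, which is how this kind of pruning shrinks the search space.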