Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights can be gained by mining temporal patterns from these time series. Unlike traditional pattern mining, temporal pattern mining (TPM) incorporates event time intervals into the extracted patterns, making them more expressive at the expense of increased mining time complexity. Existing TPM methods either cannot scale to large datasets, or work only on pre-processed temporal events rather than on time series. This paper presents our Frequent Temporal Pattern Mining from Time Series (FTPMfTS) approach, which provides: (1) The end-to-end FTPMfTS process, taking time series as input and producing frequent temporal patterns as output. (2) The efficient Hierarchical Temporal Pattern Graph Mining (HTPGM) algorithm, which uses efficient data structures for fast support and confidence computation, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of HTPGM that uses mutual information, a measure of data correlation known from information theory, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that HTPGM outperforms the baselines in runtime and memory consumption, and can scale to big datasets. The approximate HTPGM is up to two orders of magnitude faster than the baselines and consumes less memory, while retaining high accuracy.
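To make the mutual-information pruning idea in (3) concrete, the following is a minimal sketch, not the HTPGM implementation. It assumes the time series have already been converted to equal-length sequences of symbols (for example "On"/"Off"), estimates mutual information from empirical frequencies, and keeps only the pairs of series whose mutual information reaches a threshold. The function names `mutual_information` and `correlated_pairs` and the parameter `min_mi` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two equal-length symbol sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    px = Counter(xs)            # marginal counts of symbols in xs
    py = Counter(ys)            # marginal counts of symbols in ys
    pxy = Counter(zip(xs, ys))  # joint counts of aligned symbol pairs
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def correlated_pairs(series, min_mi):
    """Return the pairs of series names whose mutual information is at least min_mi.

    `series` maps a name to its symbol sequence. Pairs below the
    (hypothetical) threshold are treated as unpromising and dropped.
    """
    names = sorted(series)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if mutual_information(series[a], series[b]) >= min_mi
    ]
```

In such a scheme, only the series that appear in at least one retained pair would be passed on to the pattern mining step. How the threshold is chosen, and whether the measure is normalized, are design decisions this sketch leaves open.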