High utility sequential pattern mining (HUSPM) is an important task in knowledge discovery and data analytics with many real-world applications. In some cases, however, HUSPM cannot provide a good measure for predicting what will happen next. High utility sequential rule mining (HUSRM) discovers sequential rules with both high utility and high confidence, which addresses this limitation of HUSPM. All existing HUSRM algorithms aim to find high utility partially-ordered sequential rules (HUSRs), which are not consistent with reality and may generate spurious HUSRs. Therefore, in this paper, we formulate the problem of high utility totally-ordered sequential rule mining and propose two novel algorithms, called TotalSR and TotalSR+, which aim to identify all high utility totally-ordered sequential rules (HTSRs). TotalSR creates a utility table that efficiently calculates antecedent support, and a utility prefix sum list that computes the remaining utility of a sequence in O(1) time. We also introduce a left-first expansion strategy that exploits the anti-monotonic property to enable a confidence pruning strategy. With the help of pruning strategies based on utility upper bounds, TotalSR can also drastically reduce the search space and avoid a large amount of meaningless computation. In addition, TotalSR+ uses an auxiliary antecedent record table to discover HTSRs more efficiently. Finally, extensive experiments on both real and synthetic datasets demonstrate that TotalSR is significantly more efficient than algorithms with fewer pruning strategies, and that TotalSR+ is significantly more efficient than TotalSR in terms of running time and scalability.
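The O(1) remaining-utility lookup can be illustrated with an ordinary prefix sum. The sketch below is not the data structure used by TotalSR, whose layout is not described in this abstract. It assumes a sequence flattened into a list of per-item utilities, and the names `build_prefix_sums` and `remaining_utility` are hypothetical.

```python
from itertools import accumulate


def build_prefix_sums(utilities):
    """Return a list where prefix[i] is the total utility of items 0..i-1.

    Assumption: the sequence is flattened into one utility per item.
    """
    return [0] + list(accumulate(utilities))


def remaining_utility(prefix, position):
    """Utility of all items strictly after `position` (0-based).

    One subtraction on the precomputed list, so the lookup is O(1).
    """
    return prefix[-1] - prefix[position + 1]


# Illustrative utilities for a five-item sequence (made-up values).
utilities = [5, 3, 8, 2, 6]
prefix = build_prefix_sums(utilities)

print(remaining_utility(prefix, 1))  # utility of the items after index 1
print(remaining_utility(prefix, 4))  # nothing follows the last item
```

Building the list costs one linear pass per sequence. After that, each remaining-utility query is a single subtraction, which matters when the value is needed at many positions during rule expansion.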