The goal of high-utility sequential pattern mining (HUSPM) is to efficiently discover profitable or useful sequential patterns in a large number of sequences. However, knowing which patterns satisfy the utility requirement is insufficient for making predictions. To compensate for this deficiency, high-utility sequential rule mining (HUSRM) explores the confidence, or probability, with which a consequent sequential pattern occurs given that an antecedent sequential pattern has appeared. It has numerous applications, such as product recommendation and weather prediction. However, the existing algorithm, also known as HUSRM, only extracts all eligible rules and neglects the correlation between the generated sequential rules. To address this issue, we propose a novel algorithm called correlated high-utility sequential rule miner (CoUSR), which integrates the concept of correlation into HUSRM. The proposed algorithm requires not only that each rule be correlated but also that the patterns in the antecedent and consequent of the high-utility sequential rule be correlated. The algorithm adopts a utility-list structure to avoid multiple database scans. Additionally, several pruning strategies are used to improve the algorithm's efficiency. Experiments on several real-world datasets demonstrate that CoUSR is effective and efficient in terms of running time and memory consumption.
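To make the notions of rule confidence and rule utility concrete, the following minimal Python sketch scores a single sequential rule X → Y over a toy sequence database. It is an illustration only, not the CoUSR algorithm: it uses no utility-list, no pruning, and no correlation measure. It assumes the commonly used partially ordered rule definition (a sequence contains X → Y if it can be split so that every item of X occurs before the split and every item of Y after it), that each item occurs at most once per sequence, and that the stored numbers are already item utilities. The names `items_in`, `contains_rule`, `rule_stats`, and `db` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A sequence is an ordered list of itemsets; each itemset maps item -> utility.
Sequence = List[Dict[str, int]]


def items_in(itemsets: List[Dict[str, int]]) -> Set[str]:
    """Return the set of all items appearing in the given itemsets."""
    out: Set[str] = set()
    for itemset in itemsets:
        out.update(itemset)
    return out


def contains_rule(seq: Sequence, x: Set[str], y: Set[str]) -> bool:
    """True if some split point puts all of X before it and all of Y after it."""
    for k in range(1, len(seq)):
        if x <= items_in(seq[:k]) and y <= items_in(seq[k:]):
            return True
    return False


def rule_stats(db: List[Sequence], x: Set[str], y: Set[str]) -> Tuple[int, float, int]:
    """Return (support, confidence, utility) of the rule X -> Y (assumed definitions)."""
    sup_x = sum(1 for s in db if x <= items_in(s))          # sequences containing X
    matching = [s for s in db if contains_rule(s, x, y)]    # sequences containing X -> Y
    # Rule utility: total utility of the items of X and Y in matching sequences.
    utility = sum(
        u
        for s in matching
        for itemset in s
        for item, u in itemset.items()
        if item in x | y
    )
    confidence = len(matching) / sup_x if sup_x else 0.0
    return len(matching), confidence, utility


# Toy database (made-up values, for illustration only).
db: List[Sequence] = [
    [{"a": 2}, {"b": 3, "c": 1}, {"d": 4}],
    [{"a": 1}, {"d": 2}],
    [{"b": 5}, {"a": 2}, {"c": 3}],
    [{"c": 2}, {"a": 1}],
]

print(rule_stats(db, {"a"}, {"c"}))  # (2, 0.5, 8)
```

In this toy database, the rule {a} → {c} holds in two of the four sequences that contain a, giving a confidence of 0.5 and a total utility of 8. A miner in the spirit of the one described would keep only rules whose utility, confidence, and correlation all meet user-specified thresholds.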