Targeted Mining of Top-k High Utility Itemsets

utility · threshold · algorithms · query algorithms · efficient algorithms

March 25, 2023

Shan Huang, Wensheng Gan, Jinbao Miao, Xuming Han, Philippe Fournier-Viger

From arXiv. Preprint; 5 figures, 5 tables.

Finding high-importance patterns in data is an emerging data mining task known as high-utility itemset mining (HUIM). Given a minimum utility threshold, a HUIM algorithm extracts all high-utility itemsets (HUIs), that is, itemsets whose utility values are not less than the threshold. This can reveal a wealth of useful information, but it does not take the precise needs of users into account. In particular, users often want to focus on patterns that contain specific items rather than find all patterns. To address this, targeted mining has emerged, which focuses on user preferences, but only preliminary work has been conducted. For example, the targeted high-utility itemset querying algorithm (TargetUM) uses a lexicographic tree to query itemsets containing a target pattern. However, selecting the minimum utility threshold is difficult when the user is not familiar with the database being processed. As a solution, this paper formulates the task of targeted mining of the top-k high-utility itemsets and proposes an efficient algorithm, TMKU, based on TargetUM, to discover the top-k target high-utility itemsets (top-k THUIs). Several pruning strategies are used to reduce memory consumption and execution time. Extensive experiments show that TMKU performs well on real and synthetic datasets.
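
To make the task concrete, here is a minimal brute-force sketch of what "top-k target high-utility itemsets" means on a toy database. It is not the TMKU or TargetUM algorithm: it uses no lexicographic tree and no pruning, and simply enumerates every itemset. The transactions, the profit table, and the function names are all hypothetical, and the utility of an itemset is assumed to follow the usual HUIM definition (quantity times unit profit, summed over the transactions that contain the whole itemset).

```python
from itertools import combinations
import heapq

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit (external utility) of each item.
PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, profit):
    """Sum quantity * profit over the transactions containing every item of `itemset`."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def top_k_target_huis(transactions, profit, target, k):
    """Brute force: the k highest-utility itemsets that contain every item of `target`."""
    items = sorted({i for t in transactions for i in t})
    target = frozenset(target)
    candidates = []
    for size in range(max(1, len(target)), len(items) + 1):
        for combo in combinations(items, size):
            if target <= set(combo):
                u = utility(combo, transactions, profit)
                if u > 0:  # skip itemsets that never occur together
                    candidates.append((u, combo))
    # No minimum utility threshold is needed: the user only supplies k.
    return heapq.nlargest(k, candidates)


if __name__ == "__main__":
    for u, itemset in top_k_target_huis(TRANSACTIONS, PROFIT, {"a"}, 3):
        print(itemset, u)
```

With the target `{"a"}` and `k = 3`, the sketch returns `("a",)` with utility 20, `("a", "c")` with 19, and `("a", "b")` with 16. Exhaustive enumeration grows exponentially with the number of items, which is the cost that the pruning strategies mentioned in the abstract are meant to avoid.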
