One of the biggest problems in itemset mining is that a new data structure or algorithm must be developed every time a user wants to extract a different type of itemset. To overcome this, we propose a method called Generic Itemset Mining based on Reinforcement Learning (GIM-RL), which offers a unified framework for training an agent to extract any type of itemset. In GIM-RL, the environment formulates the extraction of a target type of itemset from a dataset as a sequence of iterative steps. At each step, an agent performs an action that adds an item to, or removes an item from, the current itemset, and then obtains from the environment a reward that represents how relevant the resulting itemset is to the target type. Through numerous trial-and-error steps, in which diverse actions yield various rewards, the agent is trained to maximise cumulative rewards so that it acquires the optimal action policy for forming as many itemsets of the target type as possible. In this framework, an agent for extracting any type of itemset can be trained as long as a reward suitable for that type can be defined. Extensive experiments on mining high utility itemsets, frequent itemsets and association rules show the general effectiveness of GIM-RL and one remarkable potential, namely agent transfer. We hope that GIM-RL opens a new research direction towards learning-based itemset mining.
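The environment loop described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the class `ItemsetEnv`, the toggle-style action encoding (one action per item, which adds the item if absent and removes it if present), and the binary reward `frequent_reward`, which targets frequent itemsets. A uniformly random policy stands in for the trained agent, so the sketch shows only the environment interface and the role of the reward, not the learning procedure itself.

```python
import random
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple

Itemset = FrozenSet[str]
RewardFn = Callable[[Itemset, List[Set[str]]], float]


def support(itemset: Itemset, transactions: List[Set[str]]) -> float:
    """Fraction of transactions containing every item of `itemset`.

    The empty itemset is given support 0.0 so that it is never rewarded.
    """
    if not itemset:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_reward(min_support: float) -> RewardFn:
    """Hypothetical reward for frequent itemsets: +1 if frequent, else -1."""
    def reward(itemset: Itemset, transactions: List[Set[str]]) -> float:
        return 1.0 if support(itemset, transactions) >= min_support else -1.0
    return reward


class ItemsetEnv:
    """Toy environment: action i toggles items[i] in the current itemset."""

    def __init__(self, items: Sequence[str], transactions: List[Set[str]],
                 reward_fn: RewardFn) -> None:
        self.items = list(items)
        self.transactions = transactions
        self.reward_fn = reward_fn  # swapping this changes the target type
        self.current: Itemset = frozenset()

    def reset(self) -> Itemset:
        self.current = frozenset()
        return self.current

    def step(self, action: int) -> Tuple[Itemset, float]:
        # Symmetric difference adds the item if absent, removes it if present.
        self.current = self.current.symmetric_difference({self.items[action]})
        return self.current, self.reward_fn(self.current, self.transactions)


def random_rollout(env: ItemsetEnv, steps: int,
                   seed: int = 0) -> Tuple[Set[Itemset], float]:
    """Run a random policy (a stand-in for a trained agent).

    Returns the itemsets that earned a positive reward and the total reward.
    """
    rng = random.Random(seed)
    env.reset()
    found: Set[Itemset] = set()
    total = 0.0
    for _ in range(steps):
        itemset, reward = env.step(rng.randrange(len(env.items)))
        total += reward
        if reward > 0:
            found.add(itemset)
    return found, total


if __name__ == "__main__":
    transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
    env = ItemsetEnv(["a", "b", "c"], transactions, frequent_reward(0.5))
    found, total = random_rollout(env, steps=50, seed=1)
    print(sorted(sorted(s) for s in found), total)
```

Under these assumptions, only `reward_fn` depends on the target type: passing a different function (for example, one based on utility rather than support) would retarget the same environment without changing `step`, which mirrors the claim that an agent can be trained for any type for which a suitable reward can be defined.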