Contrast pattern mining (CPM) is an important and popular subfield of data mining. Traditional sequential patterns cannot describe the contrast information between different classes of data, whereas contrast patterns, which are built on the concept of contrast, can describe the significant differences between datasets under different contrast conditions. The number of papers published in this field indicates that research interest in CPM remains active. Because CPM spans many research questions and research methods, it is difficult for researchers new to the field to grasp its overall state in a short period of time. The purpose of this article is therefore to provide an up-to-date, comprehensive, and structured overview of the research directions in contrast pattern mining. First, we give an in-depth introduction to CPM, including its basic concepts, types, mining strategies, and metrics for assessing discriminative ability. Then we classify CPM methods according to their characteristics into boundary-based algorithms, tree-based algorithms, evolutionary fuzzy system-based algorithms, decision tree-based algorithms, and other algorithms. For each category, we list the classical algorithms and discuss their advantages and disadvantages. We also present advanced topics in CPM. Finally, we conclude the survey with a discussion of the challenges and opportunities in this field.