As a popular multi-label classification method, Classifier Chains (CC) has recently been applied to many multi-label classification tasks. However, existing Classifier Chains methods struggle to model and exploit the underlying dependencies in the label space, and often suffer from poorly ordered chains and error propagation. In this paper, we present a three-phase augmented Classifier Chains approach based on co-occurrence analysis for multi-label classification. First, we propose a co-occurrence matrix method to model the underlying correlations between a label and its predecessors and to determine the head labels of a chain. Second, we propose two augmented strategies for optimizing the label order of a chain so that it approximates the underlying label correlations in the label space: Greedy Order Classifier Chain and Trigram Order Classifier Chain. Extensive experiments were conducted on six benchmark datasets, and the results show that the proposed augmented CC approaches significantly improve multi-label classification performance compared with CC and its popular variants, while maintaining lower computational costs.
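To make the idea of co-occurrence-driven chain ordering concrete, the following is a minimal sketch only. The abstract does not give the exact definitions used for the co-occurrence matrix, the head-label selection, or the Greedy Order strategy, so everything here is an assumption for illustration: the function names `cooccurrence_matrix` and `greedy_order` are hypothetical, the head label is taken to be the one with the largest total co-occurrence, and each subsequent label is the unused one that co-occurs most often with the label placed last.

```python
import numpy as np


def cooccurrence_matrix(Y):
    """Count, for every label pair, the samples in which both labels are active."""
    Y = np.asarray(Y, dtype=int)
    C = Y.T @ Y             # C[i, j] = number of samples with labels i and j both set
    np.fill_diagonal(C, 0)  # a label's co-occurrence with itself is not informative
    return C


def greedy_order(C):
    """Assumed greedy rule (not necessarily the paper's): pick the head label with
    the largest total co-occurrence, then repeatedly append the unused label that
    co-occurs most with the label placed last. Ties go to the lowest index."""
    n = C.shape[0]
    head = int(np.argmax(C.sum(axis=1)))
    order = [head]
    remaining = set(range(n)) - {head}
    while remaining:
        last = order[-1]
        nxt = max(sorted(remaining), key=lambda j: C[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Toy label matrix: 5 samples (rows), 4 labels (columns).
Y = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
])

C = cooccurrence_matrix(Y)
order = greedy_order(C)
print(order)  # [2, 1, 0, 3]
```

An order produced this way could then be handed to any chain implementation that accepts an explicit label order, for example through the `order` parameter of scikit-learn's `ClassifierChain`.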