Tree-Based Dynamic Classifier Chains

From arXiv. Preprint of the version accepted at the Machine Learning journal, Special Issue on Discovery Science. Code available at https://github.com/keelm/XDCC

Classifier chains are an effective technique for modeling label dependencies in multi-label classification. However, the method requires a fixed, static order of the labels. While in theory any order is sufficient, in practice this order has a substantial impact on the quality of the final prediction. Dynamic classifier chains denote the idea that, for each instance to be classified, the order in which the labels are predicted is chosen dynamically. The complexity of a naive implementation of such an approach is prohibitive, because it would require training a sequence of classifiers for every possible permutation of the labels. To tackle this problem efficiently, we propose a new approach based on random decision trees, which can dynamically select the label ordering for each prediction. We show empirically that a dynamic selection of the next label improves over the use of a static ordering under an otherwise unchanged random decision tree model. In addition, we demonstrate an alternative approach based on extreme gradient boosted trees, which allows for a more target-oriented training of dynamic classifier chains. Our results show that this variant outperforms random decision trees and other tree-based multi-label classification methods. More importantly, the dynamic selection strategy makes it possible to considerably speed up training and prediction.
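To make the idea of a per-instance label order concrete, here is a minimal Python sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a greedy rule that always predicts the most confident remaining label next, which the abstract does not specify, and it uses a hand-written, hypothetical `toy_scorer` in place of a trained tree model. The names `predict_dynamic_chain`, `toy_scorer`, and `naive_chain_count` are illustrative only.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer type: given the features and the labels decided so far,
# return an estimated probability for every label that is still undecided.
Scorer = Callable[[List[float], Dict[str, int]], Dict[str, float]]


def naive_chain_count(num_labels: int) -> int:
    """Number of label permutations a naive approach would need a chain for."""
    return math.factorial(num_labels)


def predict_dynamic_chain(
    x: List[float], labels: List[str], scorer: Scorer, threshold: float = 0.5
) -> Tuple[Dict[str, int], List[str]]:
    """Predict all labels, choosing the next label separately for each instance.

    Assumption: the next label is the one whose probability is farthest
    from 0.5, i.e. the one the scorer is most confident about.
    """
    decided: Dict[str, int] = {}
    order: List[str] = []
    while len(decided) < len(labels):
        probs = scorer(x, decided)
        label = max(probs, key=lambda name: abs(probs[name] - 0.5))
        decided[label] = int(probs[label] >= threshold)
        order.append(label)
    return decided, order


def toy_scorer(x: List[float], decided: Dict[str, int]) -> Dict[str, float]:
    """Hand-written stand-in for a trained model (purely illustrative)."""
    probs = {"sports": x[0], "football": 0.5 * x[0], "politics": x[1]}
    # Label dependency: a positive "sports" decision makes "football" likelier.
    if decided.get("sports") == 1:
        probs["football"] = min(1.0, probs["football"] + 0.4)
    return {name: p for name, p in probs.items() if name not in decided}


LABELS = ["sports", "football", "politics"]
pred_a, order_a = predict_dynamic_chain([0.9, 0.2], LABELS, toy_scorer)
pred_b, order_b = predict_dynamic_chain([0.6, 0.05], LABELS, toy_scorer)
```

In this toy setup the two instances are processed in different orders: the first decides "sports" before "football" and so benefits from the dependency, while the second starts with "politics". With only three labels, `naive_chain_count(3)` already gives 6 possible static chains, which illustrates why training one chain per permutation does not scale.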
