This paper studies the construction of multiclass classifiers from binary classifiers, with performance quantified by the regret, defined with respect to the Bayes optimal log-loss. We discuss two known methods. The first is one-vs-all (OVA), for which we prove that the multiclass regret is upper bounded by the sum of the binary regrets of the constituent classifiers. The second is hierarchical classification based on a binary tree, for which we prove that the multiclass regret is exactly a weighted sum of the constituent binary regrets, where the weighting is determined by the tree structure. We also introduce a leverage-hierarchical classification method, which potentially yields smaller log-loss and regret. The advantages of these classification methods are demonstrated by simulations on both synthetic and real-life datasets.
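The exact decomposition claimed for hierarchical classification can be checked numerically for a single input. The sketch below is illustrative only and is not taken from the paper: it assumes a hypothetical balanced tree over four classes, treats the log-loss regret as the KL divergence between the true class distribution and the model's, and assumes each node's weight is the true probability mass of the classes under that node. The names `TREE`, `p_true`, and `q_nodes`, and all numeric values, are made up for the example.

```python
import math


def binary_kl(p, q):
    """KL divergence (in nats) between Bernoulli(p) and Bernoulli(q)."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total


def multiclass_kl(p, q):
    """KL divergence (in nats) between two discrete distributions."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0)


# Hypothetical tree: each internal node is (classes on left, classes on right).
TREE = [
    ((0, 1), (2, 3)),  # root
    ((0,), (1,)),      # left child of the root
    ((2,), (3,)),      # right child of the root
]


def tree_model_probs(node_left_probs):
    """Class probabilities implied by per-node estimates of P(left | node)."""
    q = [1.0] * 4
    for (left, right), q_left in zip(TREE, node_left_probs):
        for c in left:
            q[c] *= q_left
        for c in right:
            q[c] *= 1.0 - q_left
    return q


def weighted_binary_regrets(p, node_left_probs):
    """Sum over nodes of (true mass under the node) * (binary KL at the node)."""
    total = 0.0
    for (left, right), q_left in zip(TREE, node_left_probs):
        mass = sum(p[c] for c in left + right)  # assumed node weight
        if mass > 0.0:
            p_left = sum(p[c] for c in left) / mass
            total += mass * binary_kl(p_left, q_left)
    return total


p_true = [0.5, 0.2, 0.2, 0.1]   # made-up true class distribution
q_nodes = [0.6, 0.7, 0.5]       # made-up binary estimates, one per node
q_model = tree_model_probs(q_nodes)

multiclass_regret = multiclass_kl(p_true, q_model)
weighted_sum = weighted_binary_regrets(p_true, q_nodes)
print(multiclass_regret, weighted_sum)
```

Under these assumptions the two printed values agree up to floating-point error, which is the chain rule for KL divergence applied along the tree. The OVA result is an upper bound rather than an identity, so it is not illustrated here.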