Label tree-based algorithms are widely used to tackle multi-class and multi-label problems with a large number of labels. We focus on a particular subclass of these algorithms that use probabilistic classifiers in the tree nodes. Examples of such algorithms are hierarchical softmax (HSM), designed for multi-class classification, and probabilistic label trees (PLTs), which generalize HSM to multi-label problems. If the tree structure is given, learning a PLT can be done with provable regret guarantees [Wydmuch et al., 2018]. However, finding a tree structure that yields a PLT with low training and prediction computational costs as well as low statistical error appears to be a very challenging and not yet well-understood problem. In this paper, we address the problem of finding a tree structure with low computational cost. First, we show that finding a tree with optimal training cost is NP-complete; nevertheless, there are tractable special cases for which a perfect approximation or an exact solution can be obtained in time linear in the number of labels $m$. For the general case, we obtain an $O(\log m)$ approximation, also in linear time. Moreover, we prove an upper bound on the expected prediction cost expressed in terms of the expected training cost. We also show that under additional assumptions the prediction cost of a PLT is $O(\log m)$.
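To make the optimized quantity concrete, the following minimal sketch (our illustration, not code from the paper) counts the node classifiers updated by a single training instance under the standard PLT update rule of Wydmuch et al. [2018]: a node's binary classifier receives an update iff it is the root or a child of a "positive" node, i.e., of a node lying on a path from the root to the leaf of a positive label. The training cost the abstract refers to is the expectation of this quantity over the data distribution; all names here are illustrative assumptions.

```python
# Illustrative sketch: per-instance PLT training cost for a given label tree T.
# Assumption: a node classifier is updated iff it is the root or a child of a
# positive node (a node on a root-to-leaf path of a positive label).

def training_cost(children, parent, leaf_of, positive_labels):
    """Number of node classifiers updated by one training instance.

    children: dict mapping node -> list of child nodes (tree structure T)
    parent:   dict mapping node -> its parent (root maps to None)
    leaf_of:  dict mapping label -> its leaf node
    positive_labels: set of labels assigned to the instance
    """
    # Collect all positive nodes: the leaf of each positive label and all of
    # its ancestors up to the root.
    positive = set()
    for label in positive_labels:
        v = leaf_of[label]
        while v is not None and v not in positive:
            positive.add(v)
            v = parent[v]
    # The root classifier is always updated; additionally, every child of a
    # positive node receives an update (with a positive or negative target).
    return 1 + sum(len(children.get(v, [])) for v in positive)

# Toy example: a complete binary tree over m = 4 labels.
children = {"r": ["a", "b"], "a": ["l1", "l2"], "b": ["l3", "l4"]}
parent = {"r": None, "a": "r", "b": "r", "l1": "a", "l2": "a", "l3": "b", "l4": "b"}
leaf_of = {1: "l1", 2: "l2", 3: "l3", 4: "l4"}
print(training_cost(children, parent, leaf_of, {1}))
# positive nodes are r, a, l1, so the count is 1 + 2 + 2 + 0 = 5
```

The tree-structure search problem studied in the paper is then to choose $T$ minimizing the expectation of this per-instance cost, which is what the NP-completeness and approximation results concern.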