Many classification problems consider classes that form a hierarchy. Classifiers that are aware of this hierarchy may be able to make confident predictions at a coarse level despite being uncertain at the fine-grained level. While it is generally possible to vary the granularity of predictions using a threshold at inference time, most contemporary work considers only leaf-node prediction, and almost no prior work has compared methods at multiple operating points. We present an efficient algorithm to produce operating characteristic curves for any method that assigns a score to every class in the hierarchy. Applying this technique to evaluate existing methods reveals that top-down classifiers are dominated by a naive flat softmax classifier across the entire operating range. We further propose two novel loss functions and show that a soft variant of the structured hinge loss is able to significantly outperform the flat baseline. Finally, we investigate the poor accuracy of top-down classifiers and demonstrate that they perform relatively well on unseen classes. Code is available online at https://github.com/jvlmdr/hiercls.
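The idea of varying prediction granularity with an inference-time threshold can be illustrated with a minimal sketch. It is not the released implementation: the toy hierarchy, the names `PARENT`, `NAMES`, `LEAVES`, `node_probs` and `predict`, and the decision rule (choose the deepest node whose probability exceeds the threshold) are all assumptions made for illustration. A flat softmax over the leaves scores every node by summing the probability of its descendant leaves.

```python
import math

# Hypothetical toy hierarchy (an assumption for illustration only).
# PARENT[i] is the parent of node i; the root has parent -1.
PARENT = [-1, 0, 0, 1, 1, 2]
NAMES = ["root", "animal", "vehicle", "dog", "cat", "car"]
LEAVES = [3, 4, 5]  # node indices of the leaf classes, in logit order


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def node_probs(leaf_logits):
    """Score every node by summing the flat-softmax mass of its descendant leaves."""
    probs = [0.0] * len(PARENT)
    for leaf, p in zip(LEAVES, softmax(leaf_logits)):
        node = leaf
        while node != -1:  # add the leaf's mass to itself and to all ancestors
            probs[node] += p
            node = PARENT[node]
    return probs


def depth(node):
    """Number of edges between the node and the root."""
    d = 0
    while PARENT[node] != -1:
        node = PARENT[node]
        d += 1
    return d


def predict(leaf_logits, threshold):
    """Assumed rule: return the deepest node with probability above the threshold.

    Ties in depth are broken by probability; the root is the fallback.
    """
    probs = node_probs(leaf_logits)
    candidates = [n for n, p in enumerate(probs) if p > threshold] or [0]
    return max(candidates, key=lambda n: (depth(n), probs[n]))


if __name__ == "__main__":
    logits = [2.0, 1.8, 0.0]  # scores for dog, cat, car
    for tau in (0.4, 0.6, 0.95):
        print(tau, NAMES[predict(logits, tau)])
```

In this sketch, raising the threshold moves the prediction from a leaf towards the root. Sweeping the threshold over a dataset and recording a correctness metric against a specificity metric at each value would trace out one operating curve per method.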