In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. Unfortunately, most classification datasets do not come with a hierarchical structure, so classical flat classifiers must be employed. In this paper, we investigate a class of methods that induce a hierarchy and can thereby similarly improve classification performance over flat classifiers. These methods share a two-step structure: first cluster the conditional distributions, then apply a hierarchical classifier to the induced hierarchy. We demonstrate the effectiveness of these methods, both for discovering a latent hierarchy and for improving accuracy, in principled simulation settings and three real data applications.
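The two-step structure described above (cluster the conditional distributions, then classify along the induced hierarchy) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: "conditional distributions" is read as the distribution of the features given each class label, each such distribution is summarised only by its sample mean, classes are grouped by greedy centroid-linkage agglomerative clustering, and a nearest-centroid rule stands in for the hierarchical classifier. The function names (`class_means`, `induce_hierarchy`, `predict`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def class_means(X, y):
    """Assumption: summarise each class-conditional distribution by its sample mean."""
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)
    return {label: mean(pts) for label, pts in by_class.items()}


def induce_hierarchy(means, n_groups):
    """Step 1: greedily merge the two closest groups of classes until n_groups remain."""
    groups = [[label] for label in sorted(means)]
    while len(groups) > n_groups:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ci = mean([means[c] for c in groups[i]])
                cj = mean([means[c] for c in groups[j]])
                d = math.dist(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


def predict(x, means, groups):
    """Step 2: top-down prediction, first choosing a group, then a class within it."""
    centres = [mean([means[c] for c in g]) for g in groups]
    g = min(range(len(groups)), key=lambda k: math.dist(x, centres[k]))
    return min(groups[g], key=lambda c: math.dist(x, means[c]))


# Toy data (hypothetical): four flat labels with a latent two-group structure.
X = [(0.0, 0.0), (0.2, 0.1), (0.0, 1.0), (0.1, 1.2),
     (10.0, 0.0), (10.2, 0.1), (10.0, 1.0), (10.1, 1.2)]
y = ["cat", "cat", "dog", "dog", "car", "car", "bus", "bus"]

means = class_means(X, y)
groups = induce_hierarchy(means, n_groups=2)
print(groups)
print(predict((0.1, 0.9), means, groups))
print(predict((9.9, 0.0), means, groups))
```

In this toy setting the induced groups recover the latent split between the two pairs of labels, and prediction descends the two-level hierarchy. A fuller treatment would compare the conditional distributions with a proper distributional distance and train a learned classifier at each node; the sketch only shows how the two steps fit together.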