In this work, we propose a method to efficiently compute label posteriors of a base flat classifier when only a few validation examples are available, within a bottom-up hierarchical inference framework. A stand-alone validation set (not used to train the base classifier) is preferred for posterior estimation to avoid overfitting the base classifier; however, a small validation set limits the number of features one can effectively use. We propose a simple yet robust logit vector compression approach, based on generalized logits and label confusions, for label posterior estimation in hierarchical classification. We provide extensive comparative experiments against other compression techniques across validation sets of multiple sizes, and we also compare with related hierarchical classification approaches. The proposed approach mitigates the problem of not having enough validation examples for reliable posterior estimation while maintaining strong hierarchical classification performance.
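To make the bottom-up setting concrete, the following minimal sketch shows how leaf-level posteriors from a flat classifier can be aggregated into posteriors for internal nodes of a label hierarchy. It is an illustration under stated assumptions, not the proposed method: the toy hierarchy, the label names, and the function `bottom_up_posteriors` are hypothetical, and a plain softmax stands in for the validation-based posterior estimate from compressed logits that the abstract describes.

```python
import math

# Hypothetical toy hierarchy (not from the paper): leaf label -> parent node.
PARENT = {"cat": "animal", "dog": "animal", "car": "vehicle", "bus": "vehicle"}
LEAVES = ["cat", "dog", "car", "bus"]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bottom_up_posteriors(leaf_logits):
    """Return posteriors for leaves and their parents.

    Assumption: softmax is used here as a stand-in posterior estimate.
    The abstract instead estimates posteriors on a validation set.
    """
    leaf_post = dict(zip(LEAVES, softmax(leaf_logits)))
    node_post = dict(leaf_post)
    # Bottom-up step: a parent's posterior is the sum over its leaf children.
    for leaf, p in leaf_post.items():
        parent = PARENT[leaf]
        node_post[parent] = node_post.get(parent, 0.0) + p
    return node_post


# The flat classifier is nearly undecided between "cat" and "dog",
# yet the parent node "animal" still receives most of the probability mass.
posteriors = bottom_up_posteriors([2.0, 1.9, 0.1, -1.0])
```

In this toy case, a prediction at the "animal" level can be made with more confidence than a prediction at either leaf, which is the kind of trade-off bottom-up hierarchical inference relies on.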