Most classification models treat all object classes as parallel, unrelated labels and penalize a misclassification between any two classes equally. Human beings, in contrast, can exploit high-level information when predicting the identity of an unknown object. Inspired by this observation, we propose a super-class guided network (SGNet) that integrates high-level semantic information into the network to improve its inference performance. SGNet takes two-level class annotations that contain both super-class and finer class labels. The super-classes are higher-level semantic categories, each consisting of a number of finer classes. A super-class branch (SCB), trained on super-class labels, is introduced to guide finer class prediction. At inference time, we adopt two different strategies: two-step inference (TSI) and direct inference (DI). TSI first predicts the super-class and then predicts the finer class within that super-class. DI, on the other hand, generates predictions directly from the finer class branch (FCB). Extensive experiments have been performed on the CIFAR-100 and MS COCO datasets. The experimental results validate the proposed approach and demonstrate its superior performance on image classification and object detection.
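The difference between the two inference strategies can be illustrated with a minimal sketch. It is not the SGNet implementation: the abstract does not specify how TSI restricts the finer class prediction, so the sketch assumes the simplest reading, namely an argmax over the SCB output followed by an argmax over the FCB output limited to the finer classes of the predicted super-class. The names `FINE_TO_SUPER`, `direct_inference`, and `two_step_inference`, as well as the toy hierarchy and logits, are hypothetical.

```python
import numpy as np

# Hypothetical toy hierarchy: 2 super-classes, 5 finer classes.
# FINE_TO_SUPER[i] is the super-class index of finer class i.
FINE_TO_SUPER = np.array([0, 0, 0, 1, 1])


def direct_inference(fine_logits):
    """DI: take the top-scoring finer class from the FCB output alone."""
    return int(np.argmax(fine_logits))


def two_step_inference(super_logits, fine_logits, fine_to_super=FINE_TO_SUPER):
    """TSI (assumed form): pick the super-class, then the best finer class inside it."""
    super_pred = int(np.argmax(super_logits))
    # Exclude finer classes that do not belong to the predicted super-class.
    masked = np.where(fine_to_super == super_pred, fine_logits, -np.inf)
    return super_pred, int(np.argmax(masked))


# Made-up branch outputs for a single image.
super_logits = np.array([2.0, 0.5])                 # SCB favors super-class 0
fine_logits = np.array([0.1, 1.2, 0.3, 1.5, 0.2])   # FCB favors finer class 3

di_pred = direct_inference(fine_logits)
tsi_super, tsi_pred = two_step_inference(super_logits, fine_logits)
print(di_pred, tsi_super, tsi_pred)  # 3 0 1
```

In this toy case the two strategies disagree: DI returns finer class 3, which belongs to super-class 1, while TSI commits to super-class 0 first and therefore returns finer class 1. Under this assumed form, TSI can never output a finer class outside the predicted super-class, so its accuracy depends on the SCB prediction being correct.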