Despite significant progress in object categorization in recent years, a number of important challenges remain, chiefly the ability to learn from limited labeled data and to recognize object classes within a large, potentially open, set of labels. Zero-shot learning is one way of addressing these challenges, but it has only been shown to work with class vocabularies of limited size and typically requires a separation between supervised and unsupervised classes, allowing the former to inform the latter but not vice versa. We propose the notion of vocabulary-informed learning to alleviate the above-mentioned challenges and to address supervised, zero-shot, generalized zero-shot, and open set recognition within a unified framework. Specifically, we propose a weighted maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from both supervised and unsupervised vocabulary atoms. The distance constraints ensure that labeled samples are projected closer to their correct prototypes in the embedding space than to any others. We show that the resulting model yields improvements in supervised, zero-shot, generalized zero-shot, and large open set recognition, with a vocabulary of up to 310K classes, on the Animals with Attributes and ImageNet datasets.
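To make the distance constraint concrete, the following is a minimal sketch, not the paper's implementation. It assumes a linear projection `W` into the embedding space, squared Euclidean distance, and an unweighted hinge penalty; the abstract describes a weighted formulation, so the weighting is omitted here. All names (`project`, `margin_violation`, `nearest_prototype`, `margin`) are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def project(W: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Linear projection of a feature vector x into the embedding space (assumed form)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def margin_violation(g: Sequence[float],
                     prototypes: Sequence[Sequence[float]],
                     label: int,
                     margin: float = 1.0) -> float:
    """Sum of hinge penalties over all vocabulary prototypes other than the correct one.

    The penalty is zero only when the embedded sample g is closer to its
    correct prototype than to every other prototype by at least `margin`.
    """
    d_correct = sq_dist(g, prototypes[label])
    loss = 0.0
    for j, u in enumerate(prototypes):
        if j == label:
            continue
        loss += max(0.0, margin + d_correct - sq_dist(g, u))
    return loss


def nearest_prototype(g: Sequence[float],
                      prototypes: Sequence[Sequence[float]]) -> int:
    """Predict the class whose prototype is nearest in the embedding space."""
    return min(range(len(prototypes)), key=lambda j: sq_dist(g, prototypes[j]))


# Toy vocabulary of three prototypes in a 2-D embedding space (made-up values).
prototypes = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hypothetical projection: drops the third feature
g = project(W, [0.5, 0.2, 9.0])

print(margin_violation(g, prototypes, label=0))  # constraint satisfied for the correct label
print(margin_violation(g, prototypes, label=1))  # constraint violated for a wrong label
print(nearest_prototype(g, prototypes))
```

In this sketch the prototype list can include atoms for classes with no labeled samples. Those atoms still enter the constraint as negatives, which is one way to read the claim that unsupervised vocabulary informs the supervised classes.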