Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting their relations to seen classes via attribute-based descriptions. Since attributes are often related to specific parts of objects, many recent works focus on discovering discriminative regions. However, these methods usually require additional complex part detection modules or attention mechanisms. In this paper, 1) we show that common ZSL backbones (without explicit attention or part detection) can implicitly localize attributes, yet this property is not exploited. 2) Exploiting it, we then propose SELAR, a simple method that further encourages attribute localization and, surprisingly, achieves very competitive generalized ZSL (GZSL) performance compared with more complex state-of-the-art methods. Our findings provide useful insight for designing future GZSL methods, and SELAR provides an easy-to-implement yet strong baseline.