Object detection is considered as one of the most challenging problems in computer vision, since it requires correct prediction of both classes and locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD) where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, where a convex combination of embeddings are used in conjunction with a detection framework. For evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images and also a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.
翻译:物体探测被视为计算机视觉中最具挑战性的问题之一,因为它要求正确预测图像中物体的类别和位置。在本研究中,我们定义了一个更困难的情景,即零射物体探测(ZSD),因为有些目标物体类别没有视觉培训数据。我们提出了解决ZSD问题的新办法,在这种问题上,结合探测框架使用嵌入的混凝体组合。为评价ZSD方法,我们建议用时装-MNIST图像构建一个简单的数据集,并针对Pascal VOC探测挑战进行定制零射分解。实验结果表明,我们的方法为ZSD带来有希望的结果。