We explore object detection with two attributes: color and material. The task is to simultaneously detect objects and infer their color and material. A straightforward approach is to add attribute heads at the very end of a standard object detection pipeline. However, we observe that the two goals are in conflict: object detection should be attribute-independent, and attributes should be largely object-independent. Features computed by a standard detection network entangle the category and attribute features; we disentangle them with a two-stream model in which the category and attribute features are computed independently but the classification heads share Regions of Interest (RoIs). Compared with a traditional single-stream model, our model shows significant improvements on VG-20, a subset of Visual Genome, on both supervised and attribute transfer tasks.
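To make the two-stream idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes RoIs are supplied as pixel boxes, replaces each backbone with a toy per-pixel linear map, and uses average pooling inside each RoI. All names (`make_stream`, `roi_pool`, `make_head`, `build_two_stream`), the feature size, and the class counts are hypothetical and chosen only for illustration. The point it shows is structural: the category and attribute feature maps come from separate weights, while every head reads from the same RoIs.

```python
import random


def make_stream(seed, in_ch, out_ch):
    """Toy 'backbone': a per-pixel linear map with its own random weights."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) for _ in range(in_ch)] for _ in range(out_ch)]

    def stream(image):
        # image is H x W x in_ch (nested lists); output is H x W x out_ch.
        return [[[sum(wk * px for wk, px in zip(row_w, pixel)) for row_w in w]
                 for pixel in row] for row in image]

    return stream


def roi_pool(fmap, roi):
    """Average the features inside roi = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = roi
    cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return [sum(c[k] for c in cells) / len(cells) for k in range(len(cells[0]))]


def make_head(seed, in_dim, num_classes):
    """Toy linear classification head returning one score per class."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(num_classes)]

    def head(feat):
        return [sum(a * b for a, b in zip(row, feat)) for row in w]

    return head


def build_two_stream(feat_dim=8, n_cat=3, n_color=4, n_material=2):
    """Assemble two independent streams whose heads share the same RoIs.

    The default sizes are illustrative assumptions, not values from the paper.
    """
    cat_stream = make_stream(0, 3, feat_dim)    # category features
    attr_stream = make_stream(1, 3, feat_dim)   # attribute features
    cat_head = make_head(2, feat_dim, n_cat)
    color_head = make_head(3, feat_dim, n_color)
    material_head = make_head(4, feat_dim, n_material)

    def predict(image, rois):
        cat_map = cat_stream(image)
        attr_map = attr_stream(image)
        results = []
        for roi in rois:
            # Same RoI, different feature maps: this is the shared part.
            cat_feat = roi_pool(cat_map, roi)
            attr_feat = roi_pool(attr_map, roi)
            results.append({
                "roi": roi,
                "category_scores": cat_head(cat_feat),
                "color_scores": color_head(attr_feat),
                "material_scores": material_head(attr_feat),
            })
        return results

    return predict


if __name__ == "__main__":
    rng = random.Random(42)
    image = [[[rng.random() for _ in range(3)] for _ in range(6)] for _ in range(6)]
    predict = build_two_stream()
    for det in predict(image, rois=[(0, 0, 3, 3), (2, 2, 6, 6)]):
        print(det["roi"], len(det["category_scores"]), len(det["color_scores"]))
```

A single-stream baseline in this sketch would pass the same pooled feature to all three heads; the two-stream variant differs only in that `attr_feat` is pooled from a separately computed feature map.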