State-of-the-art object detectors are fast and accurate, but they require a large amount of well-annotated training data to perform well. Obtaining such annotations for a particular task, i.e., fine-grained annotations, is costly in practice. In contrast, obtaining common-sense relationships from text, e.g., "a table-lamp is a lamp that sits on top of a table", is much easier. Additionally, common-sense relationships like "on-top-of" are easy to annotate in a task-agnostic fashion. In this paper, we propose a probabilistic model that uses such relational knowledge to transform an off-the-shelf detector of coarse object categories (e.g., "table", "lamp") into a detector of fine-grained categories (e.g., "table-lamp"). We demonstrate that our method, RelDetect, achieves performance competitive with fine-tuning-based state-of-the-art object detector baselines when an extremely small amount of fine-grained annotations is available ($0.2\%$ of the entire dataset). We also demonstrate that RelDetect can exploit the inherent transferability of relationship information to obtain better performance ($+5$ mAP points) than these baselines on an unseen dataset (zero-shot transfer). In summary, we demonstrate the power of using relationships for object detection on datasets where fine-grained object categories can be linked to coarse-grained categories via suitable relationships.