Object detectors often experience a drop in performance when new environmental conditions are insufficiently represented in the training data. This paper studies how to automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment without relying on human intervention, i.e., in a self-supervised fashion. In our setting, an agent initially explores the environment using a pre-trained off-the-shelf detector to locate objects and associate pseudo-labels. By assuming that pseudo-labels for the same object must be consistent across different views, we devise a novel mechanism for producing refined predictions from the consensus among observations. Our approach improves the off-the-shelf object detector by 2.66% in terms of mAP and outperforms the current state of the art without relying on ground-truth annotations.
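The consensus idea can be illustrated with a minimal sketch. It is not the paper's actual mechanism, which the abstract does not specify. It assumes detections have already been associated with the same physical object across views, and it swaps in a plain score-weighted majority vote as the consensus rule. The names `consensus_labels`, `object_id`, and the sample observations are hypothetical.

```python
from collections import defaultdict


def consensus_labels(observations):
    """Refine per-view pseudo-labels by score-weighted majority vote.

    observations: iterable of (object_id, label, score) tuples, one per view.
    Assumes object_id already links detections of the same physical object
    (hypothetical; the cross-view association step is not shown here) and
    that scores are positive detector confidences.
    Returns {object_id: (label, support)}, where support is the winning
    label's share of the total score mass for that object.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for object_id, label, score in observations:
        votes[object_id][label] += score

    refined = {}
    for object_id, label_scores in votes.items():
        total = sum(label_scores.values())
        best = max(label_scores, key=label_scores.get)
        refined[object_id] = (best, label_scores[best] / total)
    return refined


# Illustrative data only: object 0 is seen from three views, object 1 from two.
observations = [
    (0, "chair", 0.9),
    (0, "chair", 0.8),
    (0, "couch", 0.6),
    (1, "tv", 0.7),
    (1, "tv", 0.5),
]
refined = consensus_labels(observations)
```

In a self-training loop of the kind the abstract describes, refined labels like these would replace the raw per-view predictions as fine-tuning targets. The support value could be used to discard objects whose views disagree too strongly, which is a design choice of this sketch rather than something stated in the abstract.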