Object detection, a fundamental computer vision problem, is of paramount importance in smart camera systems. However, a truly reliable camera system can be achieved only if the underlying object detection component is robust across varying imaging conditions (or domains), for instance, different times of day or adverse weather. In an effort to achieve such a reliable camera system, in this paper we attempt to train a detector with this robustness. Unfortunately, building a detector that performs well across varying imaging conditions requires labeled training images, often in large numbers, from a plethora of corner cases. As manually obtaining such a large labeled dataset may be infeasible, we suggest using synthetic images to mimic different training image domains. We propose a novel contrastive learning method that aligns the latent representations of a pair of real and synthetic images, making the detector robust to the different domains. However, we found that merely contrasting the embeddings may lead to catastrophic forgetting of the information essential for object detection. Hence, we employ a continual-learning-based penalty to alleviate forgetting while contrasting the representations. We show that our proposed method outperforms a wide range of alternatives in the extremely challenging, yet under-studied, scenario of object detection at night-time.
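The combination of a contrastive alignment term and a forgetting penalty can be illustrated with a small sketch. The abstract does not specify the exact losses, so everything below is an assumption made for illustration: an InfoNCE-style loss stands in for the contrastive term, an EWC-style quadratic penalty stands in for the continual learning penalty, and the names `info_nce`, `drift_penalty`, `total_loss`, `lam`, and `temperature` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(real, synth, temperature=0.1):
    """Assumed contrastive term: real[i] should match synth[i] and
    differ from every synth[j] with j != i (mean InfoNCE loss)."""
    total = 0.0
    for i, anchor in enumerate(real):
        logits = [cosine(anchor, s) / temperature for s in synth]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]
    return total / len(real)


def drift_penalty(weights, anchor_weights, importance):
    """Assumed forgetting penalty (EWC-style): penalize moving each
    weight away from its value in the detector trained for detection,
    scaled by a per-weight importance estimate."""
    return sum(
        f * (w - w0) ** 2
        for w, w0, f in zip(weights, anchor_weights, importance)
    )


def total_loss(real, synth, weights, anchor_weights, importance,
               lam=1.0, temperature=0.1):
    """Contrastive alignment plus lam times the forgetting penalty."""
    return (info_nce(real, synth, temperature)
            + lam * drift_penalty(weights, anchor_weights, importance))


# Toy usage: two real/synthetic embedding pairs and two weights.
real = [[1.0, 0.0], [0.0, 1.0]]
synth = [[1.0, 0.0], [0.0, 1.0]]
loss = total_loss(real, synth, [1.0, 3.0], [1.0, 2.0], [0.5, 2.0])
```

In this sketch, `lam` trades off domain alignment against staying close to the detection-trained weights: with `lam` set to zero the objective reduces to pure contrastive alignment, the setting in which the abstract reports catastrophic forgetting.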