Deep learning-based object proposal methods have enabled significant advances in many computer vision pipelines. However, current state-of-the-art proposal networks use a closed-world assumption, meaning they are only trained to detect instances of the training classes while treating every other region as background. This style of solution fails to provide high recall on out-of-distribution objects, rendering it inadequate for use in realistic open-world applications where novel object categories of interest may be observed. To better detect all objects, we propose a classification-free Self-Trained Proposal Network (STPN) that leverages a novel self-training optimization strategy combined with dynamically weighted loss functions that account for challenges such as class imbalance and pseudo-label uncertainty. Our model is designed to excel not only on existing optimistic open-world benchmarks but also in challenging operating environments where there is significant label bias. To showcase this, we devise two challenges to test the generalization of proposal models when the training data contains (1) less diversity within the labeled classes, and (2) fewer labeled instances. Our results show that STPN achieves state-of-the-art novel object generalization on all tasks.