Open-world instance segmentation has recently gained significant popularity due to its importance in many real-world applications, such as autonomous driving, robot perception, and remote sensing. However, previous methods have either produced unsatisfactory results or relied on complex systems and paradigms. We ask whether there is a simple way to obtain state-of-the-art results. We identify two observations that help us achieve the best of both worlds: 1) query-based methods demonstrate superiority over dense proposal-based methods in open-world instance segmentation, and 2) learning localization cues is sufficient for open-world instance segmentation. Based on these observations, we propose a simple query-based method named OpenInst for open-world instance segmentation. OpenInst leverages advanced query-based methods like QueryInst and focuses on learning localization cues. Notably, OpenInst is an extremely simple and straightforward framework without any auxiliary modules or post-processing, yet it achieves state-of-the-art results on multiple benchmarks. Specifically, in the COCO$\to$UVO scenario, OpenInst achieves a mask AR of 53.3, outperforming the previous best methods by 2.0 AR with a simpler structure. We hope that OpenInst can serve as a solid baseline for future research in this area.