Interactive segmentation allows users to extract target masks by making positive/negative clicks. Although explored by many previous works, there is still a gap between academic approaches and industrial needs: first, existing models are not efficient enough to run on low-power devices; second, they perform poorly when used to refine preexisting masks because they cannot avoid destroying the parts that are already correct. FocalClick solves both issues at once by predicting and updating the mask in localized areas. For higher efficiency, we decompose the slow prediction on the entire image into two fast inferences on small crops: a coarse segmentation on the Target Crop, and a local refinement on the Focus Crop. To make the model work with preexisting masks, we formulate a sub-task termed Interactive Mask Correction, and propose Progressive Merge as the solution. Progressive Merge exploits morphological information to decide where to preserve and where to update, enabling users to refine any preexisting mask effectively. FocalClick achieves competitive results against SOTA methods with significantly fewer FLOPs. It also shows significant superiority when making corrections on preexisting masks. Code and data will be released at github.com/XavierCHEN34/ClickSEG.
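The idea of deciding "where to preserve and where to update" can be illustrated with a small sketch. The abstract does not spell out the merge rule, so the code below assumes one plausible reading: only the connected region of changed pixels that contains the new click is accepted, and every other pixel keeps its previous label. The function names (`changed_region_at_click`, `merge_locally`), the 4-connectivity, and the list-of-lists mask format are all hypothetical choices made for illustration, not the released implementation.

```python
from collections import deque


def changed_region_at_click(prev_mask, new_mask, click):
    """Collect the 4-connected pixels around `click` where the two masks differ.

    Assumed rule for illustration only; returns an empty set when the
    prediction did not change the clicked pixel.
    """
    height, width = len(prev_mask), len(prev_mask[0])
    row0, col0 = click
    if prev_mask[row0][col0] == new_mask[row0][col0]:
        return set()

    region = {(row0, col0)}
    queue = deque([(row0, col0)])
    while queue:
        row, col = queue.popleft()
        for nrow, ncol in ((row - 1, col), (row + 1, col),
                           (row, col - 1), (row, col + 1)):
            inside = 0 <= nrow < height and 0 <= ncol < width
            if (inside and (nrow, ncol) not in region
                    and prev_mask[nrow][ncol] != new_mask[nrow][ncol]):
                region.add((nrow, ncol))
                queue.append((nrow, ncol))
    return region


def merge_locally(prev_mask, new_mask, click):
    """Copy the new prediction into the previous mask only inside the
    changed region connected to the click; preserve everything else."""
    merged = [row[:] for row in prev_mask]
    for row, col in changed_region_at_click(prev_mask, new_mask, click):
        merged[row][col] = new_mask[row][col]
    return merged


# Toy example: the user clicks at (0, 2) to extend the object to the right.
prev = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0]]
# The new prediction adds the wanted column, but also erases a correct
# pixel at (0, 0) and adds a stray pixel at (2, 4).
new = [[0, 1, 1, 0, 0],
       [1, 1, 1, 0, 0],
       [0, 0, 0, 0, 1]]

merged = merge_locally(prev, new, click=(0, 2))
for row in merged:
    print(row)
```

In this toy case the change next to the click is accepted, while the two unrelated changes are discarded, so the parts of the preexisting mask that were already correct survive the update. A practical version would more likely operate on arrays and use a library routine for connected-component labeling.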