Click-based interactive segmentation (IS) aims to extract target objects under the guidance of user clicks. For this task, most current deep learning (DL)-based methods follow the general pipelines of semantic segmentation. Although they achieve promising performance, they do not fully and explicitly utilize and propagate the click information, which inevitably leads to unsatisfactory segmentation results, even at the clicked points. To address this issue, we propose to formulate the IS task as a Gaussian process (GP)-based pixel-wise binary classification model on each image. To solve this model, we use amortized variational inference to approximate the intractable GP posterior in a data-driven manner, and then decouple the approximated GP posterior into double space forms for efficient sampling with linear complexity. We then construct the corresponding GP classification framework, named GPCIS, which is integrated with the deep kernel learning mechanism for greater flexibility. The main characteristics of the proposed GPCIS are: 1) under the explicit guidance of the derived GP posterior, the information contained in the clicks can be finely propagated to the entire image and thereby improve the segmentation; 2) the accuracy of the predictions at the clicked points has good theoretical support. These merits of GPCIS, as well as its good generality and high efficiency, are substantiated by comprehensive experiments on several benchmarks, with quantitative and qualitative comparisons against representative methods.
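To make the idea of propagating click information through a GP posterior more concrete, the following is a minimal sketch, not the GPCIS implementation. It makes several assumptions that are not in the abstract: a fixed RBF kernel on raw pixel coordinates (instead of a learned deep kernel), click labels treated as noise-free latent values of +1 (foreground) and -1 (background) (instead of amortized variational inference with a classification likelihood), and a pathwise sampling scheme that combines a random-Fourier-feature prior sample with a kernel-based correction at the clicks. Reading this scheme as the "double space forms" is also an assumption. All function and parameter names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale):
    """Squared-exponential kernel between the rows of a (N, d) and b (M, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


def rff_features(x, omega, phase):
    """Random Fourier features whose inner product approximates the RBF kernel."""
    n_feat = omega.shape[1]
    return np.sqrt(2.0 / n_feat) * np.cos(x @ omega + phase)


def sample_posterior(pixels, clicks, click_values, lengthscale=0.2,
                     n_feat=256, jitter=1e-8, seed=None):
    """Draw one approximate GP posterior sample at `pixels` given click values.

    Cost grows linearly with the number of pixels N: O(N * (n_feat + M))
    plus an O(M^3) solve over the M clicks, which is small in practice.
    """
    rng = np.random.default_rng(seed)
    dim = pixels.shape[1]

    # Weight-space part: an approximate prior sample f(x) = phi(x)^T w.
    omega = rng.normal(scale=1.0 / lengthscale, size=(dim, n_feat))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_feat)
    w = rng.normal(size=n_feat)
    prior_at_pixels = rff_features(pixels, omega, phase) @ w
    prior_at_clicks = rff_features(clicks, omega, phase) @ w

    # Function-space part: correct the prior sample so it matches the clicks.
    k_cc = rbf_kernel(clicks, clicks, lengthscale) + jitter * np.eye(len(clicks))
    k_pc = rbf_kernel(pixels, clicks, lengthscale)
    correction = k_pc @ np.linalg.solve(k_cc, click_values - prior_at_clicks)
    return prior_at_pixels + correction


# Usage: a 32 x 32 grid of normalized pixel coordinates and four clicks.
ys, xs = np.meshgrid(np.linspace(0, 1, 32), np.linspace(0, 1, 32), indexing="ij")
pixels = np.stack([ys.ravel(), xs.ravel()], axis=1)
clicks = np.array([[0.2, 0.2], [0.5, 0.3], [0.8, 0.8], [0.2, 0.8]])
click_values = np.array([1.0, 1.0, -1.0, -1.0])  # +1 foreground, -1 background

latent = sample_posterior(pixels, clicks, click_values, seed=0)
mask = (latent > 0).reshape(32, 32)  # thresholded binary segmentation
```

In this simplified noise-free setting, the correction term forces every sample to pass through the click values, so the thresholded prediction at each clicked point agrees with the user's label up to the small jitter term. This illustrates, under the stated assumptions only, why a GP formulation can give guarantees at the clicks.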