Interactive segmentation has recently been explored as an effective and efficient way to harvest high-quality segmentation masks by iteratively incorporating user hints. Although iterative in nature, most existing interactive segmentation methods ignore the dynamics of successive interactions and treat each interaction independently. We propose to model iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL), where each voxel is treated as an agent. Given the large exploration space of voxel-wise prediction and the dependence among neighboring voxels in segmentation tasks, we adopt multi-agent reinforcement learning, with the voxel-level policy shared among agents. Since boundary voxels are more important for segmentation, we further introduce a boundary-aware reward, which consists of a global reward, in the form of a relative cross-entropy gain, that updates the policy in a constrained direction, and a boundary reward, in the form of a relative weight, that emphasizes the correctness of boundary predictions. To combine the advantages of different interaction types, namely the simplicity and efficiency of point clicks and the stability and robustness of scribbles, we propose a supervoxel-clicking interaction design. Experimental results on four benchmark datasets show that the proposed method significantly outperforms the state of the art, requiring fewer interactions while achieving higher accuracy and enhanced robustness.
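To make the boundary-aware reward concrete, one plausible formalization of the two components described above is sketched here; the notation ($p_i^{(t)}$, $g_i$, $X_i^{(t)}$, $w_i$) is illustrative and not taken from the paper. Let $p_i^{(t)}$ denote the predicted foreground probability of voxel $i$ after refinement step $t$, $g_i$ its ground-truth label, and $X_i^{(t)}$ the voxel-wise cross-entropy:

$$
X_i^{(t)} = -\,g_i \log p_i^{(t)} - (1 - g_i)\log\bigl(1 - p_i^{(t)}\bigr),
\qquad
r_i^{(t)} = w_i \bigl( X_i^{(t-1)} - X_i^{(t)} \bigr).
$$

Under this reading, the relative gain $X_i^{(t-1)} - X_i^{(t)}$ plays the role of the global reward (positive exactly when step $t$ improves the prediction for voxel $i$, which constrains the policy update direction), while setting the relative weight $w_i > 1$ for voxels near the object boundary and $w_i = 1$ elsewhere realizes the boundary reward.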