Recently, diffusion frameworks have achieved performance comparable to previous state-of-the-art image generation models. Because of their powerful noise-to-image denoising pipeline, researchers are curious about their variants for discriminative tasks. This paper proposes DiffusionInst, a novel framework that represents instances as instance-aware filters and formulates instance segmentation as a noise-to-filter denoising process. The model is trained to reverse noisy ground-truth filters without any inductive bias from an RPN. During inference, it takes randomly generated filters as input and outputs instance masks via one-step or multi-step denoising. Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models. We hope our work can serve as a simple yet effective baseline and inspire the design of more efficient diffusion frameworks for challenging discriminative tasks. Our code is available at https://github.com/chenhaoxing/DiffusionInst.
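The noise-to-filter formulation above can be illustrated with a minimal sketch. All names, dimensions, and the placeholder denoiser below are illustrative assumptions, not the paper's actual architecture: inference starts from Gaussian-noise filter vectors (no RPN proposals), a denoiser maps them toward clean instance-aware filters, and each filter is combined with a shared mask feature map via dynamic convolution to produce an instance mask.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; illustrative only, not the paper's settings.
FILTER_DIM = 8        # length of each instance-aware filter vector
H, W = 16, 16         # spatial size of the shared mask feature map

def denoise_filters(noisy_filters, step):
    """Stand-in for the learned denoiser: one reverse-diffusion step
    mapping noisy filter vectors toward clean instance filters.
    A real model would be a network conditioned on image features;
    here we simply shrink the noise as a placeholder."""
    return noisy_filters * (1.0 - 0.5 * step)

def filters_to_masks(filters, mask_features):
    """Dynamic convolution: dot each instance filter with the per-pixel
    mask features to obtain an instance mask logit map, then sigmoid."""
    # filters: (N, FILTER_DIM); mask_features: (FILTER_DIM, H, W)
    logits = np.einsum("nd,dhw->nhw", filters, mask_features)
    return 1.0 / (1.0 + np.exp(-logits))  # soft masks in (0, 1)

# Inference: start from pure Gaussian noise, with no region proposals.
mask_features = rng.standard_normal((FILTER_DIM, H, W))
noisy = rng.standard_normal((4, FILTER_DIM))  # 4 random instance filters

# One-step denoising; multi-step would loop this under a noise schedule.
clean = denoise_filters(noisy, step=1.0)
masks = filters_to_masks(clean, mask_features)
print(masks.shape)  # one soft mask per instance: (4, 16, 16)
```

The key property this sketch conveys is that the number of denoising steps is a free inference-time choice: the same trained denoiser can be applied once for speed or iterated for accuracy.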