Motivated by recent advancements in text-to-image diffusion, we study erasure of specific concepts from the model's weights. While Stable Diffusion has shown promise in producing explicit or realistic artwork, it has raised concerns regarding its potential for misuse. We propose a fine-tuning method that can erase a visual concept from a pre-trained diffusion model, given only the name of the style and using negative guidance as a teacher. We benchmark our method against previous approaches that remove sexually explicit content and demonstrate its effectiveness, performing on par with Safe Latent Diffusion and censored training. To evaluate artistic style removal, we conduct experiments erasing five modern artists from the network and conduct a user study to assess the human perception of the removed styles. Unlike previous methods, our approach removes concepts from a diffusion model permanently rather than modifying the output at inference time, so it cannot be circumvented even if a user has access to the model weights. Our code, data, and results are available at https://erasing.baulab.info/.
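As a brief illustration of what "negative guidance as a teacher" can look like, here is a minimal sketch under assumed notation ($\epsilon_{\theta^*}$ for the frozen pre-trained noise predictor, $\epsilon_\theta$ for the fine-tuned copy, $c$ for the concept prompt being erased, and $\eta$ for a guidance scale; none of these symbols appear in the abstract itself). The fine-tuned model can be trained so that its conditional prediction matches a target guided *away* from the concept:

$$\min_\theta \; \mathbb{E}_{x_t,\, t}\, \Bigl\| \epsilon_\theta(x_t, c, t) \;-\; \bigl( \epsilon_{\theta^*}(x_t, t) \;-\; \eta\, \bigl[\, \epsilon_{\theta^*}(x_t, c, t) - \epsilon_{\theta^*}(x_t, t) \,\bigr] \bigr) \Bigr\|_2^2$$

Under this reading, the frozen model supplies both the unconditional prediction and the concept-conditioned direction, and the objective pushes the edited weights opposite to that direction, which is consistent with needing only the concept's name rather than a curated dataset.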