Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry. The significant advances in text-to-image generation techniques have prompted global discussions on privacy, copyright, and safety, as numerous unauthorized personal IDs, content, artistic creations, and potentially harmful materials have been learned by these models and later utilized to generate and distribute uncontrolled content. To address this challenge, we propose \textbf{Forget-Me-Not}, an efficient and low-cost solution designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds, without impairing its ability to generate other content. Alongside our method, we introduce the \textbf{Memorization Score (M-Score)} and \textbf{ConceptBench} to measure the models' capacity to generate general concepts, grouped into three primary categories: ID, object, and style. Using M-Score and ConceptBench, we demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts. Furthermore, Forget-Me-Not offers two practical extensions: a) removal of potentially harmful or NSFW content, and b) enhancement of model accuracy, inclusion and diversity through \textbf{concept correction and disentanglement}. It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution. To encourage future research in this critical area and promote the development of safe and inclusive generative models, we will open-source our code and ConceptBench at \href{https://github.com/SHI-Labs/Forget-Me-Not}{https://github.com/SHI-Labs/Forget-Me-Not}.
