Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion - Zhuanzhi Papers

Text embedding · Embedding · Manipulation · Embedding space · Complex semantics

April 19, 2023

Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Inhwa Han, Serin Yang, Taesung Kwon, Jong Chul Ye

Diffusion models have shown superior performance in image generation and manipulation, but the inherent stochasticity presents challenges in preserving and manipulating image content and identity. While previous approaches like DreamBooth and Textual Inversion have proposed model or latent representation personalization to maintain the content, their reliance on multiple reference images and complex training limits their practicality. In this paper, we present a simple yet highly effective approach to personalization using highly personalized (HiPer) text embedding by decomposing the CLIP embedding space for personalization and content manipulation. Our method does not require model fine-tuning or identifiers, yet still enables manipulation of background, texture, and motion with just a single image and target text. Through experiments on diverse target texts, we demonstrate that our approach produces highly personalized and complex semantic image edits across a wide range of tasks. We believe that the novel understanding of the text embedding space presented in this work has the potential to inspire further research across various tasks.
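The abstract does not spell out how the CLIP embedding space is decomposed, so the following is only a minimal sketch of one plausible reading: the text-conditioning sequence is split token-wise into leading rows that carry the target text and a few trailing rows that carry the personalized information. Everything named here is an assumption for illustration: `toy_text_embedding` is a hypothetical stand-in for a real CLIP text encoder, `N_PERSONAL` and `EMBED_DIM` are arbitrary toy values, and the personalized rows are zero placeholders where a real method would optimize them against the single source image.

```python
import random

SEQ_LEN = 77     # sequence length commonly used by CLIP text encoders
EMBED_DIM = 8    # toy width chosen for readability (assumption)
N_PERSONAL = 5   # hypothetical number of trailing rows reserved for personalization


def toy_text_embedding(prompt):
    """Hypothetical stand-in for a CLIP text encoder.

    Returns SEQ_LEN rows of EMBED_DIM pseudo-random floats, seeded by the prompt
    so that the same prompt always maps to the same rows.
    """
    rng = random.Random(prompt)
    return [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(SEQ_LEN)]


def compose(target_embedding, personal_rows):
    """Keep the leading rows of the target-text embedding and overwrite the
    trailing rows with the personalized rows."""
    keep = len(target_embedding) - len(personal_rows)
    return [row[:] for row in target_embedding[:keep]] + [row[:] for row in personal_rows]


# Target text describing the desired edit (illustrative prompt, not from the paper).
target = toy_text_embedding("a sitting dog")

# Placeholder for the personalized part; a real method would learn these rows
# from the single source image rather than use zeros.
personal = [[0.0] * EMBED_DIM for _ in range(N_PERSONAL)]

# Conditioning sequence that would be passed to the diffusion model.
conditioning = compose(target, personal)
```

In this sketch the diffusion model weights are never touched; only the trailing rows of the conditioning sequence differ between images, which matches the abstract's claim that no model fine-tuning or identifier token is required.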
