SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis


April 28, 2023

Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab

Text-conditioned image generation has made significant progress in recent years with generative adversarial networks and, more recently, diffusion models. While diffusion models conditioned on text prompts produce impressive, high-quality images, accurately representing the details of a complex text prompt, such as the number of instances of a specific object, remains challenging. To address this limitation, we propose a novel guidance approach for the sampling process in the diffusion model that leverages bounding box and segmentation map information at inference time without additional training data. Through a novel loss in the sampling process, our approach guides the model with semantic features from CLIP embeddings and enforces geometric constraints, leading to high-resolution images that accurately represent the scene. To obtain bounding box and segmentation map information, we structure the text prompt as a scene graph and enrich the nodes with CLIP embeddings. Our proposed model achieves state-of-the-art performance on two public benchmarks for image generation from scene graphs, surpassing both scene-graph-to-image and text-based diffusion models on various metrics. Our results demonstrate the effectiveness of incorporating bounding box and segmentation map guidance in the diffusion model sampling process for more accurate text-to-image generation.
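As a rough illustration of what "structuring the text prompt as a scene graph" can mean, the sketch below stores object nodes and (subject, predicate, object) triples and reads off the per-class instance count that the abstract highlights as hard for text-only conditioning. It is a minimal sketch under stated assumptions, not the authors' implementation: the `SceneGraph` class, its method names, and the example prompt are all hypothetical, and the CLIP embeddings, bounding boxes, and segmentation maps mentioned in the abstract are not modeled here.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneGraph:
    """Hypothetical container: object nodes plus (subject, predicate, object) edges."""

    objects: List[str] = field(default_factory=list)  # node i has class objects[i]
    triples: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, category: str) -> int:
        """Add one object instance and return its node index."""
        self.objects.append(category)
        return len(self.objects) - 1

    def relate(self, subj: int, predicate: str, obj: int) -> None:
        """Add a directed relationship between two existing nodes."""
        for idx in (subj, obj):
            if not 0 <= idx < len(self.objects):
                raise IndexError(f"unknown node index {idx}")
        self.triples.append((subj, predicate, obj))

    def instance_counts(self) -> Counter:
        """Number of instances per object class, explicit in the graph."""
        return Counter(self.objects)

    def describe(self) -> List[str]:
        """Render each triple back into a short phrase."""
        return [f"{self.objects[s]} {p} {self.objects[o]}" for s, p, o in self.triples]


# Assumed example prompt: "two sheep standing on grass, one next to the other"
graph = SceneGraph()
sheep_a = graph.add_object("sheep")
sheep_b = graph.add_object("sheep")
grass = graph.add_object("grass")
graph.relate(sheep_a, "standing on", grass)
graph.relate(sheep_b, "standing on", grass)
graph.relate(sheep_a, "next to", sheep_b)

print(graph.instance_counts())
print(graph.describe())
```

Because each instance is its own node, the count of a class is a property of the graph itself rather than something a model has to infer from the wording of the prompt.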

