Text-to-image diffusion models are gradually being introduced into computer graphics, recently enabling the development of open-domain Text-to-3D pipelines. However, for interactive editing purposes, local manipulation of content through a simplistic textual interface can be arduous. Incorporating user-guided sketches with Text-to-image pipelines offers users more intuitive control. Still, as state-of-the-art Text-to-3D pipelines rely on optimizing Neural Radiance Fields (NeRF) through gradients from arbitrary rendering views, conditioning on sketches is not straightforward. In this paper, we present SKED, a technique for editing 3D shapes represented by NeRFs. Our technique utilizes as few as two guiding sketches from different views to alter an existing neural field. The edited region respects the prompt semantics through a pre-trained diffusion model. To ensure the generated output adheres to the provided sketches, we propose novel loss functions that generate the desired edits while preserving the density and radiance of the base instance. We demonstrate the effectiveness of our method through several qualitative and quantitative experiments.
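The preservation objective mentioned above can be pictured as a masked penalty that discourages changes to the base field outside the sketch-defined edit region. The sketch below is purely illustrative: the function name, tensor shapes, and masking convention are assumptions for exposition, not the paper's actual losses.

```python
import numpy as np

def preservation_loss(base_density, base_color, edit_density, edit_color, mask):
    """Penalize deviations from the base NeRF outside the edited region.

    mask: 1.0 inside the sketch-defined edit region, 0.0 outside
    (an assumed convention; densities are per-point scalars, colors are RGB).
    """
    keep = 1.0 - mask  # weight 1 where the base instance should be preserved
    density_term = np.mean(keep * (edit_density - base_density) ** 2)
    color_term = np.mean(keep[..., None] * (edit_color - base_color) ** 2)
    return density_term + color_term
```

Under this formulation the loss is zero whenever the edited field matches the base field outside the masked region, so the optimization is free to change density and radiance only where the sketches indicate.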