农场3D：通过提炼2D扩散学习关节式3D动物 (Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion) - 专知论文

会员服务 ·

0

重建 · 3D · 生成器 · 图像生成 · 合成 ·

2023 年 4 月 20 日

Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion

翻译：农场3D：通过提炼2D扩散学习关节式3D动物

Tomas Jakab,Ruining Li,Shangzhe Wu,Christian Rupprecht,Andrea Vedaldi

from arxiv, Project page: http://farm3d.github.io

We present Farm3D, a method to learn category-specific 3D reconstructors for articulated objects entirely from "free" virtual supervision from a pre-trained 2D diffusion-based image generator. Recent approaches can learn, given a collection of single-view images of an object category, a monocular network to predict the 3D shape, albedo, illumination and viewpoint of any object occurrence. We propose a framework using an image generator like Stable Diffusion to generate virtual training data for learning such a reconstruction network from scratch. Furthermore, we include the diffusion model as a score to further improve learning. The idea is to randomise some aspects of the reconstruction, such as viewpoint and illumination, generating synthetic views of the reconstructed 3D object, and have the 2D network assess the quality of the resulting image, providing feedback to the reconstructor. Different from work based on distillation which produces a single 3D asset for each textual prompt in hours, our approach produces a monocular reconstruction network that can output a controllable 3D asset from a given image, real or generated, in only seconds. Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.

翻译：- 我们提出了一种称为农场3D的方法，可以完全从预训练的2D扩散图像生成器的“自由”虚拟监督中学习类别特定的关节式对象的3D重建器。最近的方法可以学习给定物体类别的单视图图像集合，预测任何物体出现的3D形状，反照率，照明和视点。我们提出了一种使用像稳定扩散这样的图像生成器的框架，从头开始学习这样的重建网络，生成虚拟训练数据。此外，我们将扩散模型作为得分加入到网络中，以进一步改善学习效果。其主要思想是随机化某些重建的方面，例如视点和照明，生成重建3D对象的合成视图，并让2D网络评估生成的图像的质量，向重建器提供反馈。与基于蒸馏的工作不同，其针对每个文本提示仅生成单个3D资产，并花费数小时，我们的方法在几秒钟内生成一个可以从给定的图像（真实或合成）输出可控3D资产的单眼重建网络。我们的网络可用于分析，包括单目重建，也可以用于综合，为实时应用程序（如视频游戏）生成关节式资产。

0

相关内容

【CVPR2023】Mask3D:通过学习掩码3D先验对2D视觉transformer进行预训练

【CVPR2023】Mask3D:通过学习掩码3D先验对2D视觉transformer进行预训练

专知会员服务

24+阅读 · 2023年4月9日

【斯坦福CVPR2022】EG3D:高效的几何感知三维生成对抗网络，EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks

【斯坦福CVPR2022】EG3D:高效的几何感知三维生成对抗网络，EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks

专知会员服务

18+阅读 · 2022年3月15日

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【CVPR 2022】paper解读——从头盔信号中解析生成3D姿势，这为AR/VR创造可信虚拟形象迈出了重要一步，FLAG: Flow-based 3D Avatar Generation from Sparse Observations

专知会员服务

19+阅读 · 2022年3月6日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

CVPR 2020 最佳论文与最佳学生论文！

CVPR 2020 最佳论文与最佳学生论文！

专知会员服务

36+阅读 · 2020年6月17日

【CVPR2020-Facebook】从检测到3D目标，FroDO: From Detections to 3D Objects

【CVPR2020-Facebook】从检测到3D目标，FroDO: From Detections to 3D Objects

专知会员服务

33+阅读 · 2020年5月12日

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

专知会员服务

38+阅读 · 2020年4月6日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【斯坦福大学】对抗性表征主动学习，Adversarial Representation Active Learning

【斯坦福大学】对抗性表征主动学习，Adversarial Representation Active Learning

专知会员服务

45+阅读 · 2019年12月20日

DeepMind开源最牛无监督学习BigBiGAN预训练模型

DeepMind开源最牛无监督学习BigBiGAN预训练模型

新智元

10+阅读 · 2019年10月10日

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

997篇-历史最全生成对抗网络（GAN）论文串烧

997篇-历史最全生成对抗网络（GAN）论文串烧

深度学习与NLP

16+阅读 · 2018年6月26日

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

泡泡机器人SLAM

20+阅读 · 2018年5月8日

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

专知

13+阅读 · 2018年1月23日

金属碳化物基低铂介孔催化材料的合成、界面设计与电催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

大规模数据集3D手语识别的研究

国家自然科学基金

1+阅读 · 2014年12月31日

Neolaxiflorin B的全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

再生核希尔伯特空间图像稀疏表达算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

"蛋黄-蛋壳型"金属氧化物纳米反应器的可控制备及催化特性

国家自然科学基金

0+阅读 · 2013年12月31日

联合显著性检测和对象分割的算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

GaN/电解液界面的原位STM研究

国家自然科学基金

0+阅读 · 2012年12月31日

高精细模型的向量位移映射表示及几何处理

国家自然科学基金

0+阅读 · 2011年12月31日

microRNA 和cis-nat-siRNA介导大麦细胞壁和β－D－葡聚糖合成调控的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

A Data-Efficient Approach for Long-Term Human Motion Prediction Using Maps of Dynamics

Arxiv

0+阅读 · 2023年6月6日

DFormer: Diffusion-guided Transformer for Universal Image Segmentation

Arxiv

1+阅读 · 2023年6月6日

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

Arxiv

0+阅读 · 2023年6月5日

Robust 3D-aware Object Classification via Discriminative Render-and-Compare

Arxiv

0+阅读 · 2023年6月5日

Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning

Arxiv

0+阅读 · 2023年6月5日

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

Arxiv

0+阅读 · 2023年6月2日

Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation

Arxiv

0+阅读 · 2023年6月2日

Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method

Arxiv

0+阅读 · 2023年6月2日

AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars

Arxiv

0+阅读 · 2023年6月2日

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Arxiv

11+阅读 · 2018年7月16日

VIP会员

文章信息

相关主题

相关VIP内容

【CVPR2023】Mask3D:通过学习掩码3D先验对2D视觉transformer进行预训练

【CVPR2023】Mask3D:通过学习掩码3D先验对2D视觉transformer进行预训练

专知会员服务

24+阅读 · 2023年4月9日

【斯坦福CVPR2022】EG3D:高效的几何感知三维生成对抗网络，EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks

【斯坦福CVPR2022】EG3D:高效的几何感知三维生成对抗网络，EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks

专知会员服务

18+阅读 · 2022年3月15日

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【CVPR 2022】paper解读——从头盔信号中解析生成3D姿势，这为AR/VR创造可信虚拟形象迈出了重要一步，FLAG: Flow-based 3D Avatar Generation from Sparse Observations

专知会员服务

19+阅读 · 2022年3月6日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

CVPR 2020 最佳论文与最佳学生论文！

CVPR 2020 最佳论文与最佳学生论文！

专知会员服务

36+阅读 · 2020年6月17日

【CVPR2020-Facebook】从检测到3D目标，FroDO: From Detections to 3D Objects

【CVPR2020-Facebook】从检测到3D目标，FroDO: From Detections to 3D Objects

专知会员服务

33+阅读 · 2020年5月12日

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

专知会员服务

38+阅读 · 2020年4月6日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【斯坦福大学】对抗性表征主动学习，Adversarial Representation Active Learning

【斯坦福大学】对抗性表征主动学习，Adversarial Representation Active Learning

专知会员服务

45+阅读 · 2019年12月20日

热门VIP内容

开通专知VIP会员享更多权益服务

人机协同作战规划：来自美海军陆战队的大语言模型（LLM）使用教训

对北约军事总部战略规划制定与实施的研究 | 140页

美联参会指南-联合规划与执行概述及政策框架 | 32页

俄罗斯军事规划差异性凸显其思维的重要性 | 2025最新文献

相关资讯

DeepMind开源最牛无监督学习BigBiGAN预训练模型

DeepMind开源最牛无监督学习BigBiGAN预训练模型

新智元

10+阅读 · 2019年10月10日

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

997篇-历史最全生成对抗网络（GAN）论文串烧

997篇-历史最全生成对抗网络（GAN）论文串烧

深度学习与NLP

16+阅读 · 2018年6月26日

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

泡泡机器人SLAM

20+阅读 · 2018年5月8日

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

专知

13+阅读 · 2018年1月23日

相关论文

A Data-Efficient Approach for Long-Term Human Motion Prediction Using Maps of Dynamics

Arxiv

0+阅读 · 2023年6月6日

DFormer: Diffusion-guided Transformer for Universal Image Segmentation

Arxiv

1+阅读 · 2023年6月6日

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

Arxiv

0+阅读 · 2023年6月5日

Robust 3D-aware Object Classification via Discriminative Render-and-Compare

Arxiv

0+阅读 · 2023年6月5日

Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning

Arxiv

0+阅读 · 2023年6月5日

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

Arxiv

0+阅读 · 2023年6月2日

Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation

Arxiv

0+阅读 · 2023年6月2日

Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method

Arxiv

0+阅读 · 2023年6月2日

AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars

Arxiv

0+阅读 · 2023年6月2日

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Arxiv

11+阅读 · 2018年7月16日

相关基金

金属碳化物基低铂介孔催化材料的合成、界面设计与电催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

大规模数据集3D手语识别的研究

国家自然科学基金

1+阅读 · 2014年12月31日

Neolaxiflorin B的全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

再生核希尔伯特空间图像稀疏表达算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

"蛋黄-蛋壳型"金属氧化物纳米反应器的可控制备及催化特性

国家自然科学基金

0+阅读 · 2013年12月31日

联合显著性检测和对象分割的算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

GaN/电解液界面的原位STM研究

国家自然科学基金

0+阅读 · 2012年12月31日

高精细模型的向量位移映射表示及几何处理

国家自然科学基金

0+阅读 · 2011年12月31日

microRNA 和cis-nat-siRNA介导大麦细胞壁和β－D－葡聚糖合成调控的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员