This paper presents a neural network built upon Transformers, namely PlaneTR, to simultaneously detect and reconstruct planes from a single image. Different from previous methods, PlaneTR jointly leverages the context information and the geometric structures in a sequence-to-sequence way to holistically detect plane instances in one forward pass. Specifically, we represent the geometric structures as line segments and construct the network with three main components: (i) context and line segment encoders, (ii) a structure-guided plane decoder, and (iii) a pixel-wise plane embedding decoder. Given an image and its detected line segments, PlaneTR generates the context and line segment sequences via two specially designed encoders and then feeds them into a Transformer-based decoder to directly predict a sequence of plane instances by simultaneously considering the context and global structure cues. Finally, pixel-wise embeddings are computed to assign each pixel to the predicted plane instance that is nearest to it in embedding space. Comprehensive experiments demonstrate that PlaneTR achieves state-of-the-art performance on the ScanNet and NYUv2 datasets.
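The final pixel-to-plane assignment step described above can be sketched as a nearest-neighbor search in embedding space. The function below is a minimal illustration, not the paper's implementation; the array shapes and the use of squared Euclidean distance are assumptions.

```python
import numpy as np

def assign_pixels_to_planes(pixel_emb, plane_emb):
    """Assign each pixel to the nearest predicted plane instance in
    embedding space. Hypothetical shapes: pixel_emb is (H, W, D),
    plane_emb is (K, D) for K predicted plane instances."""
    H, W, D = pixel_emb.shape
    flat = pixel_emb.reshape(-1, D)  # (H*W, D)
    # Squared Euclidean distance from every pixel to every plane embedding.
    d2 = ((flat[:, None, :] - plane_emb[None, :, :]) ** 2).sum(axis=-1)  # (H*W, K)
    # Index of the nearest plane instance for each pixel.
    return d2.argmin(axis=1).reshape(H, W)

# Toy usage: two plane embeddings in a 2-D embedding space.
planes = np.array([[0.0, 0.0], [1.0, 1.0]])
pixels = np.array([[[0.1, 0.0], [0.9, 1.1]]])  # shape (1, 2, 2)
print(assign_pixels_to_planes(pixels, planes))  # [[0 1]]
```

In practice the embeddings would come from the network's pixel-wise embedding decoder and the plane instance queries, and pixels too far from every plane embedding could be left unassigned via a distance threshold.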