3D point-clouds and 2D images are different visual representations of the physical world. While human vision can understand both representations, computer vision models designed for 2D image understanding and 3D point-cloud understanding are quite different. Our paper investigates the transferability between these two representations: whether the transfer works empirically, what factors affect the transfer performance, and how to make it work even better. We find that the same neural network architectures can indeed be used to understand both images and point-clouds. Moreover, pretrained weights can be transferred from image models to point-cloud models with minimal effort. Specifically, starting from a 2D ConvNet pretrained on an image dataset, we can convert it into a point-cloud model by \textit{inflating} its 2D convolutional filters to 3D and then finetuning only the input, output, and, optionally, normalization layers. The transferred model achieves competitive performance on 3D point-cloud classification and on indoor and driving-scene segmentation, even outperforming a wide range of point-cloud models that adopt task-specific architectures and a variety of tricks.
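To make the inflation step concrete, the following is a minimal PyTorch sketch (not the paper's released code) of one common way to inflate a pretrained 2D convolution into a 3D one: the 2D kernel is replicated along a new depth axis and rescaled so that the inflated filter initially produces approximately the same response. The function name and the choice of depth are illustrative assumptions.

\begin{verbatim}
import torch
import torch.nn as nn

def inflate_conv2d_to_3d(conv2d: nn.Conv2d, depth: int = 3) -> nn.Conv3d:
    """Inflate a pretrained 2D conv into a 3D conv by replicating the
    2D kernel along a new depth axis and dividing by the depth, so the
    3D filter initially approximates the original 2D response.
    (Illustrative sketch; assumes groups=1 and default dilation.)"""
    conv3d = nn.Conv3d(
        conv2d.in_channels,
        conv2d.out_channels,
        kernel_size=(depth, *conv2d.kernel_size),
        stride=(1, *conv2d.stride),
        padding=(depth // 2, *conv2d.padding),
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        # (out, in, kH, kW) -> (out, in, depth, kH, kW), rescaled by depth
        w2d = conv2d.weight.unsqueeze(2)
        conv3d.weight.copy_(w2d.repeat(1, 1, depth, 1, 1) / depth)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d
\end{verbatim}

Under this scheme, only newly added input/output layers (and, optionally, the normalization layers) would be finetuned on the point-cloud task, while the inflated convolutional filters carry over the image-pretrained knowledge.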