3D point-clouds and 2D images are different visual representations of the physical world. While human vision can understand both representations, computer vision models designed for 2D image and 3D point-cloud understanding are quite different. Our paper explores the potential of transferring 2D model architectures and weights to 3D point-cloud understanding by empirically investigating the feasibility of the transfer and its benefits, and by shedding light on why the transfer works. We discover that we can indeed use the same architecture and pretrained weights of a neural network to understand both images and point-clouds. Specifically, we transfer an image-pretrained model to a point-cloud model by copying or inflating the weights. We find that \textbf{f}inetuning the transformed \textbf{i}mage-\textbf{p}retrained models (FIP) with minimal effort -- finetuning only the input, output, and normalization layers -- can achieve competitive performance on 3D point-cloud classification, beating a wide range of point-cloud models that adopt task-specific architectures and use a variety of tricks. When finetuning the whole model, the performance improves even further. Meanwhile, FIP improves data efficiency, raising top-1 accuracy by up to 10.0 percentage points on few-shot classification. It also speeds up the training of point-cloud models by up to 11.1x when training to a target accuracy (e.g., 90\% accuracy). Lastly, we provide an explanation of the image-to-point-cloud transfer from the perspective of \textit{neural collapse}. The code is available at: \url{https://github.com/chenfengxu714/image2point}.
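The weight transfer described above can be sketched in a few lines. The following is a minimal NumPy illustration of I3D-style kernel inflation, in which a 2D convolution kernel is repeated along a new depth axis and rescaled; the function name and the $1/\text{depth}$ rescaling are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def inflate_conv_weight(w2d: np.ndarray, depth: int) -> np.ndarray:
    """Inflate a 2D conv kernel (out, in, kH, kW) into a 3D kernel
    (out, in, depth, kH, kW) by repeating it along a new depth axis.

    Dividing by `depth` keeps the response to a depth-constant input
    equal to the original 2D filter's response (illustrative choice).
    """
    w3d = np.repeat(w2d[:, :, None, :, :], depth, axis=2)
    return w3d / depth

# Hypothetical example: an "image-pretrained" 3x3 kernel bank.
w2d = np.random.randn(8, 3, 3, 3)
w3d = inflate_conv_weight(w2d, depth=3)

# The inflated kernel has a new depth dimension...
assert w3d.shape == (8, 3, 3, 3, 3)
# ...and summing over depth recovers the original 2D weights.
assert np.allclose(w3d.sum(axis=2), w2d)
```

The same idea applies to copying weights unchanged when the 2D and 3D layers already share a shape (e.g., pointwise convolutions), which is the "copying" case mentioned in the abstract.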