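Academia | Cewu Lu's Team at Shanghai Jiao Tong University Open-Sources PointSIFT, Setting a New Record for Point Cloud Semantic Segmentation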

July 14, 2018 · 机器之心 (Synced)

A 机器之心 (Synced) release

Shanghai Jiao Tong University

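The MVIG lab led by Cewu Lu at Shanghai Jiao Tong University recently open-sourced PointSIFT, a feature extraction module for point clouds. On Stanford Large-Scale 3D Indoor Spaces (S3DIS) [1] it reaches an mIoU of 70.23 (versus 62.74 for PointCNN, a relative improvement of 11.9%). On ScanNet [2], another widely used dataset, it reaches an mIoU of 41.50 (versus 38.28 for PointNet++, a relative improvement of 8.4%).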
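For readers unfamiliar with the metric, the sketch below shows one common way to compute mIoU (mean intersection over union) from per-point labels, together with the relative-improvement arithmetic behind the S3DIS comparison. The function names `mean_iou` and `relative_improvement` are illustrative and are not taken from the PointSIFT code. Skipping classes that appear in neither the labels nor the predictions is an assumption of this sketch; benchmark evaluation protocols may differ.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union across classes for per-point labels."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            # A correct point counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong point enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    # Assumption: classes absent from both labels and predictions are skipped.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


# S3DIS figures quoted above: 70.23 (PointSIFT) versus 62.74 (PointCNN).
s3dis_gain = round(relative_improvement(70.23, 62.74), 1)  # 11.9
```

Note that mIoU averages per-class scores, so a rare class weighs as much as a frequent one; this is why it is preferred over plain per-point accuracy for segmentation benchmarks.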

Paper: PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation

Authors: Mingyang Jiang, Yiran Wu, Cewu Lu (corresponding author)

Read the paper: arXiv:1807.00652, 2018; https://arxiv.org/abs/1807.00652
Project page: http://www.mvig.org/publications/pointSIFT.html
Code: https://github.com/MVIG-SJTU/pointSIFT

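3D semantic segmentation is well known to pose many challenges. Because of the computational cost, 2D convolutional neural networks cannot be directly extended to 3D. Since the PointNet family of models appeared, researchers have begun using raw point clouds as the basic input. Doing so preserves the intrinsic relationships in the raw data and avoids a great deal of unnecessary computation.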

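This framework still has some problems at present. For example, each point is processed too independently, so the semantic structure of the surrounding region cannot be captured efficiently. To address this, and inspired by the design of the classic SIFT feature, the MVIG group at Shanghai Jiao Tong University proposes a framework built on the PointSIFT operator. The classic SIFT feature is one of the most effective descriptors of structural semantics. On images, the SIFT operator encodes information from every direction within a region while selecting the most suitable scale of representation. Our PointSIFT carries this design idea over to the 3D point cloud domain: for each point it outputs, end to end, a representation vector that encodes information from all directions and adaptively selects a suitable scale. Unlike the traditional SIFT algorithm, we use a network structure whose parameters are learned through training.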
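To make "encoding information from all directions" concrete, here is a minimal pure-Python sketch of one way to gather direction-aware neighbors: split the space around a point into eight octants by the sign of the coordinate offsets and keep the nearest neighbor in each. This is an illustration under stated assumptions, not the released implementation. The octant partition, the function names `octant_id` and `octant_nearest_neighbors`, the `radius` parameter, and the fallback to the query point itself when an octant is empty are all choices made for this sketch. The learned convolutions and scale selection that the module applies on top are omitted.

```python
import math


def octant_id(dx, dy, dz):
    """Map an offset to one of 8 octants using the sign of each coordinate."""
    return (int(dx >= 0) << 2) | (int(dy >= 0) << 1) | int(dz >= 0)


def octant_nearest_neighbors(points, query_idx, radius=math.inf):
    """Index of the nearest neighbor of points[query_idx] in each octant.

    Assumption of this sketch: an octant with no neighbor inside `radius`
    falls back to the query point's own index.
    """
    cx, cy, cz = points[query_idx]
    best_idx = [query_idx] * 8
    best_dist = [math.inf] * 8
    for i, (x, y, z) in enumerate(points):
        if i == query_idx:
            continue
        dx, dy, dz = x - cx, y - cy, z - cz
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        if dist > radius:
            continue
        k = octant_id(dx, dy, dz)
        if dist < best_dist[k]:
            best_dist[k] = dist
            best_idx[k] = i
    return best_idx


# Toy cloud: the query point at the origin plus four neighbors.
cloud = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (-1, -1, -1), (1, -1, 1)]
neighbors = octant_nearest_neighbors(cloud, 0)
```

Picking one neighbor per octant, rather than the k nearest overall, guarantees that every direction contributes to the descriptor even when the points are unevenly distributed around the query.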

Analogy between the 3D point cloud PointSIFT module and the image SIFT operator

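As a general-purpose module for improving representational power, the PointSIFT module can be flexibly embedded in various PointNet-style frameworks, for example as shown in the figure below.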

A point cloud segmentation network with PointSIFT embedded. SA and FP denote the encoder (Set Abstraction) and decoder (Feature Propagation) modules, respectively.

References:

[1] Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 3D semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[2] Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[3] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. arXiv preprint arXiv:1612.00593, 2016.
[4] Lyne P. Tchapmi, Christopher B. Choy, Iro Armeni, JunYoung Gwak, and Silvio Savarese. SEGCloud: Semantic segmentation of 3D point clouds. CoRR, abs/1710.07563, 2017.
[5] Loïc Landrieu and Martin Simonovsky. Large-scale point cloud semantic segmentation with superpoint graphs. CoRR, abs/1711.09869, 2017.
[6] Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas. PointNet++: Deep hierarchical feature learning on point sets in a metric space. arXiv preprint arXiv:1706.02413, 2017.
[7] Y. Li, R. Bu, M. Sun, and B. Chen. PointCNN. arXiv e-prints, January 2018.

Prof. Cewu Lu is a Research Professor at Shanghai Jiao Tong University, where he leads the Machine Vision and Intelligence Group (MVIG). He was named one of MIT Technology Review's 35 Innovators Under 35 (MIT TR35, China). He was previously a postdoc at the Stanford AI Lab (under Fei-Fei Li and Leonidas Guibas) and was selected for the 1000 Overseas Talent Plan (Young Talent) (中组部青年千人计划).


