How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting. However, for the actual driving task, the global context of the 3D scene is key, e.g. a change in traffic light state can affect the behavior of a vehicle geometrically distant from that traffic light. Geometry alone may therefore be insufficient for effectively fusing representations in end-to-end driving models. In this work, we demonstrate that imitation learning policies based on existing sensor fusion methods under-perform in the presence of a high density of dynamic agents and complex scenarios, which require global contextual reasoning, such as handling traffic oncoming from multiple directions at uncontrolled intersections. Therefore, we propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention. We experimentally validate the efficacy of our approach in urban settings involving complex scenarios using the CARLA urban driving simulator. Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
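To make the attention-based fusion idea concrete, the minimal PyTorch sketch below shows one way image and LiDAR feature maps could be fused by a shared transformer operating over the concatenated token set from both modalities. The class name `AttentionFusion`, the layer sizes, and the assumption that both feature maps share the same resolution are illustrative choices for this sketch, not the authors' exact TransFuser configuration.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Sketch of attention-based multi-modal fusion: tokens from the image and
    LiDAR branches attend to each other through a shared transformer encoder,
    giving each branch access to global scene context (hypothetical sizes)."""

    def __init__(self, dim=256, num_heads=4, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, img_feat, lidar_feat):
        # img_feat, lidar_feat: (B, C, H, W) intermediate feature maps from
        # separate image and LiDAR convolutional branches (assumed same shape here).
        B, C, H, W = img_feat.shape
        img_tokens = img_feat.flatten(2).transpose(1, 2)       # (B, H*W, C)
        lidar_tokens = lidar_feat.flatten(2).transpose(1, 2)   # (B, H*W, C)
        tokens = torch.cat([img_tokens, lidar_tokens], dim=1)  # joint token set
        fused = self.encoder(tokens)                           # attention across both modalities
        # Split back and reshape so each branch receives globally fused context.
        img_out, lidar_out = fused.split(H * W, dim=1)
        img_out = img_out.transpose(1, 2).reshape(B, C, H, W)
        lidar_out = lidar_out.transpose(1, 2).reshape(B, C, H, W)
        return img_out, lidar_out


if __name__ == "__main__":
    fusion = AttentionFusion()
    img = torch.randn(2, 256, 8, 8)
    lidar = torch.randn(2, 256, 8, 8)
    f_img, f_lidar = fusion(img, lidar)
    print(f_img.shape, f_lidar.shape)  # both torch.Size([2, 256, 8, 8])
```

Because every token can attend to every other token regardless of spatial location, a cue such as a distant traffic light in the image branch can influence features anywhere in the LiDAR branch, which is the kind of global contextual reasoning that purely geometry-based projection cannot provide.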