ReS2tAC -- -- 对嵌入式ARM和CUDA装置的压强性压强性压强性压强性压强性SGM (ReS2tAC -- UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices) - 专知论文

会员服务 ·

0

Processing（编程语言） · 优化器 · ARM · CUDA · NVIDIA Tegra ·

2021 年 6 月 15 日

ReS2tAC -- UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices

翻译：ReS2tAC -- -- 对嵌入式ARM和CUDA装置的压强性压强性压强性压强性压强性SGM

Boitumelo Ruf,Jonas Mohrs,Martin Weinmann,Stefan Hinz,Jürgen Beyerer

With the emergence of low-cost robotic systems, such as unmanned aerial vehicle, the importance of embedded high-performance image processing has increased. For a long time, FPGAs were the only processing hardware that were capable of high-performance computing, while at the same time preserving a low power consumption, essential for embedded systems. However, the recently increasing availability of embedded GPU-based systems, such as the NVIDIA Jetson series, comprised of an ARM CPU and a NVIDIA Tegra GPU, allows for massively parallel embedded computing on graphics hardware. With this in mind, we propose an approach for real-time embedded stereo processing on ARM and CUDA-enabled devices, which is based on the popular and widely used Semi-Global Matching algorithm. In this, we propose an optimization of the algorithm for embedded CUDA GPUs, by using massively parallel computing, as well as using the NEON intrinsics to optimize the algorithm for vectorized SIMD processing on embedded ARM CPUs. We have evaluated our approach with different configurations on two public stereo benchmark datasets to demonstrate that they can reach an error rate as low as 3.3%. Furthermore, our experiments show that the fastest configuration of our approach reaches up to 46 FPS on VGA image resolution. Finally, in a use-case specific qualitative evaluation, we have evaluated the power consumption of our approach and deployed it on the DJI Manifold 2-G attached to a DJI Matrix 210v2 RTK unmanned aerial vehicle (UAV), demonstrating its suitability for real-time stereo processing onboard a UAV.

翻译：随着无人驾驶飞行器等低成本机器人系统的出现,嵌入高性能图像处理的重要性有所提高。长期以来,FPGA系统是唯一能够高性能计算、同时保持低电耗的处理硬件,对嵌入系统至关重要。然而,最近,内嵌的GPU系统,如NVIDIA Jetson系列,由ARM CPU和NVIDIA Tegra GPU组成的内嵌GPU系统越来越多,因此可以在图形硬件上进行大规模平行的计算。考虑到这一点,我们建议对ARM和CUDA辅助设备进行实时嵌入立式立体处理,这是基于流行和广泛使用的半全球匹配算法。在此,我们建议优化嵌入的嵌入GPU的GPU系统,例如NVDIA Jetson系列,以及利用近地天体内嵌入的SIMD GPU的内置算法,使我们在两个公共立体基准数据集上采用不同配置的方法,以显示其真实的RIV 和DR(我们最终的RIA ) 的 Ral-S-S-S-S-I-I-I-IL 直压方法的准确度,我们最后的D-IA-S-IL-I-I-I-I-IL-S-IL-IL-S-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I

0

相关内容

Processing（编程语言）

Processing（编程语言）

Processing 是一门开源编程语言和与之配套的集成开发环境（IDE）的名称。Processing 在电子艺术和视觉设计社区被用来教授编程基础，并运用于大量的新媒体和互动艺术作品中。

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

38+阅读 · 2020年5月30日

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

87+阅读 · 2020年5月11日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【泡泡一分钟】PIRVS：一个具有灵活传感器融合和硬件协同设计的先进视觉-惯性SLAM系统

【泡泡一分钟】PIRVS：一个具有灵活传感器融合和硬件协同设计的先进视觉-惯性SLAM系统

泡泡机器人SLAM

11+阅读 · 2019年9月11日

人脸检测库：libfacedetection

人脸检测库：libfacedetection

Python程序员

15+阅读 · 2019年3月22日

Github项目推荐 | 比快更快！速度超越OpenCV的人脸检测库 libfacedetection 开源！

Github项目推荐 | 比快更快！速度超越OpenCV的人脸检测库 libfacedetection 开源！

AI研习社

10+阅读 · 2019年3月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

泡泡机器人SLAM

11+阅读 · 2018年12月17日

【泡泡一分钟】尺度空间中具备渐进大尺度不变性的图像匹配

【泡泡一分钟】尺度空间中具备渐进大尺度不变性的图像匹配

泡泡机器人SLAM

12+阅读 · 2018年12月7日

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

泡泡机器人SLAM

29+阅读 · 2018年10月28日

【推荐】(TensorFlow)SSD实时手部检测与追踪（附代码）

【推荐】(TensorFlow)SSD实时手部检测与追踪（附代码）

机器学习研究会

11+阅读 · 2017年12月5日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Non-imaging real-time detection and tracking of fast-moving objects

Arxiv

0+阅读 · 2021年8月13日

Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation

Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation

Arxiv

0+阅读 · 2021年8月12日

Automatically Locating ARM Instructions Deviation between Real Devices and CPU Emulators

Arxiv

0+阅读 · 2021年8月12日

Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting

Arxiv

4+阅读 · 2020年12月14日

Real-time Scalable Dense Surfel Mapping

Real-time Scalable Dense Surfel Mapping

Arxiv

5+阅读 · 2019年9月10日

End to End Video Segmentation for Driving : Lane Detection For Autonomous Car

Arxiv

3+阅读 · 2018年12月13日

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Arxiv

5+阅读 · 2018年9月13日

Receptive Field Block Net for Accurate and Fast Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

Arxiv

3+阅读 · 2018年7月26日

A Robust Real-Time Automatic License Plate Recognition based on the YOLO Detector

Arxiv

13+阅读 · 2018年3月1日

Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video

Arxiv

5+阅读 · 2017年9月18日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

38+阅读 · 2020年5月30日

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

87+阅读 · 2020年5月11日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

【泡泡一分钟】PIRVS：一个具有灵活传感器融合和硬件协同设计的先进视觉-惯性SLAM系统

【泡泡一分钟】PIRVS：一个具有灵活传感器融合和硬件协同设计的先进视觉-惯性SLAM系统

泡泡机器人SLAM

11+阅读 · 2019年9月11日

人脸检测库：libfacedetection

人脸检测库：libfacedetection

Python程序员

15+阅读 · 2019年3月22日

Github项目推荐 | 比快更快！速度超越OpenCV的人脸检测库 libfacedetection 开源！

Github项目推荐 | 比快更快！速度超越OpenCV的人脸检测库 libfacedetection 开源！

AI研习社

10+阅读 · 2019年3月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

泡泡机器人SLAM

11+阅读 · 2018年12月17日

【泡泡一分钟】尺度空间中具备渐进大尺度不变性的图像匹配

【泡泡一分钟】尺度空间中具备渐进大尺度不变性的图像匹配

泡泡机器人SLAM

12+阅读 · 2018年12月7日

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

泡泡机器人SLAM

29+阅读 · 2018年10月28日

【推荐】(TensorFlow)SSD实时手部检测与追踪（附代码）

【推荐】(TensorFlow)SSD实时手部检测与追踪（附代码）

机器学习研究会

11+阅读 · 2017年12月5日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Non-imaging real-time detection and tracking of fast-moving objects

Arxiv

0+阅读 · 2021年8月13日

Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation

Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation

Arxiv

0+阅读 · 2021年8月12日

Automatically Locating ARM Instructions Deviation between Real Devices and CPU Emulators

Arxiv

0+阅读 · 2021年8月12日

Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting

Arxiv

4+阅读 · 2020年12月14日

Real-time Scalable Dense Surfel Mapping

Real-time Scalable Dense Surfel Mapping

Arxiv

5+阅读 · 2019年9月10日

End to End Video Segmentation for Driving : Lane Detection For Autonomous Car

Arxiv

3+阅读 · 2018年12月13日

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Arxiv

5+阅读 · 2018年9月13日

Receptive Field Block Net for Accurate and Fast Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

Arxiv

3+阅读 · 2018年7月26日

A Robust Real-Time Automatic License Plate Recognition based on the YOLO Detector

Arxiv

13+阅读 · 2018年3月1日

Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video

Arxiv

5+阅读 · 2017年9月18日

微信扫码咨询专知VIP会员