Autonomous machines (e.g., vehicles, mobile robots, drones) require sophisticated 3D mapping to perceive dynamic environments. However, maintaining a real-time 3D map is expensive in both compute and memory, especially for resource-constrained edge machines. Probabilistic OctoMap is a reliable and memory-efficient dense 3D map model that represents the full environment and can prune and expand voxel nodes dynamically. This paper presents OMU, the first efficient accelerator solution to enable real-time probabilistic 3D mapping at the edge. To improve performance, the input map voxels are updated by parallel processing element (PE) units, exploiting data parallelism. Within each PE, the voxels are stored in parallel memory banks using a specially developed data structure. In addition, a pruning address manager within each PE unit reuses the memory addresses freed by pruning. The proposed 3D mapping accelerator is implemented and evaluated in a commercial 12 nm technology. Compared to the ARM Cortex-A57 CPU in the Nvidia Jetson TX2 platform, the proposed accelerator achieves up to 62$\times$ higher performance and 708$\times$ better energy efficiency. Furthermore, the accelerator provides a throughput of 63 FPS, more than 2$\times$ the real-time requirement, enabling real-time perception for 3D mapping.
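The reuse of pruned memory addresses can be illustrated with a small software sketch. This is not the OMU hardware design: the class name `PruningAddressManager`, its methods, and the free-list policy are all assumptions made for illustration. The sketch only shows the general idea that addresses released when child voxels are pruned are handed out again before any never-used address is consumed.

```python
class PruningAddressManager:
    """Hypothetical free-list allocator for one PE's voxel memory.

    Assumption: addresses freed by pruning are recycled before fresh ones
    are used. The actual OMU policy is not described in the abstract.
    """

    def __init__(self, capacity):
        self.capacity = capacity  # total number of voxel slots in this PE
        self.next_fresh = 0       # lowest address that was never handed out
        self.free_list = []       # addresses released by pruning

    def allocate(self):
        """Return an address, preferring one that pruning has released."""
        if self.free_list:
            return self.free_list.pop()
        if self.next_fresh >= self.capacity:
            raise MemoryError("voxel memory full")
        addr = self.next_fresh
        self.next_fresh += 1
        return addr

    def release(self, addr):
        """Record that pruning has freed this address."""
        self.free_list.append(addr)


# Expanding a node allocates its 8 children; pruning releases them again.
manager = PruningAddressManager(capacity=16)
children = [manager.allocate() for _ in range(8)]
for addr in children:
    manager.release(addr)

# A later expansion reuses the freed slots instead of growing the footprint.
reused = [manager.allocate() for _ in range(8)]
```

In this sketch the second expansion draws all eight addresses from the free list, so `next_fresh` stays at 8 and the memory footprint does not grow.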