From our past few years of commercial deployment experience, we identify localization as a critical task in autonomous machine applications and a prime target for acceleration. In this paper, based on the observation that the visual frontend is a major performance and energy consumption bottleneck, we present the design and implementation of an energy-efficient hardware architecture for an ORB (Oriented FAST and Rotated BRIEF) based localization system on FPGAs. To support our multi-sensor autonomous machine localization system, we present hardware synchronization, frame-multiplexing, and parallelization techniques, all of which are integrated into our design. Compared to Nvidia TX1 and Intel i7, our FPGA-based implementation achieves 5.6x and 3.4x speedup, as well as 3.0x and 34.6x power reduction, respectively.