Mobile stereo-matching systems have become an important part of many applications, such as automated-driving vehicles and autonomous robots. Accurate stereo-matching methods usually have high computational complexity, whereas mobile platforms have only limited hardware resources in order to keep their power consumption low. This makes it difficult to maintain both an acceptable processing speed and acceptable accuracy on mobile platforms. To resolve this trade-off, we propose a novel acceleration approach for the well-known zero-mean normalized cross-correlation (ZNCC) matching-cost calculation algorithm on a Jetson TX2 embedded GPU. In our method for accelerating ZNCC, target images are scanned in a zigzag fashion so that the computation for one pixel is efficiently reused for its neighboring pixels; this reduces the amount of data transmission and increases the utilization of on-chip registers, thus increasing the processing speed. As a result, our method is 2x faster than the traditional image scanning method and 26% faster than the latest NCC method. By combining this technique with the domain transformation (DT) algorithm, our system achieves a real-time processing speed of 32 fps on a Jetson TX2 GPU for 1,280x384 pixel images with a maximum disparity of 128. Additionally, the evaluation results on the KITTI 2015 benchmark show that our combined system is more accurate by 7.26% than the same algorithm combined with the census transform, while maintaining almost the same processing speed.
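To make the two ingredients named above concrete, the following is a minimal CPU-side sketch in plain Python, not the GPU implementation described in this work. It assumes images are lists of rows of numbers and that a left-image pixel at column x matches the right-image pixel at column x - d. The function names `zncc` and `zigzag_order`, the window radius parameter `r`, and the reading of "zigzag" as a serpentine row traversal are all illustrative assumptions; the abstract does not specify the exact scan pattern or how partial results are kept in registers.

```python
import math


def zncc(left, right, x, y, d, r):
    """ZNCC between the (2r+1) x (2r+1) window centred at (x, y) in `left`
    and the window centred at (x - d, y) in `right`.

    Hypothetical helper for illustration only. The caller must keep both
    windows inside the image bounds.
    """
    offsets = range(-r, r + 1)
    lw = [left[y + j][x + i] for j in offsets for i in offsets]
    rw = [right[y + j][x - d + i] for j in offsets for i in offsets]
    n = len(lw)
    mean_l = sum(lw) / n
    mean_r = sum(rw) / n
    # Numerator: covariance of the two zero-mean windows (unnormalized).
    num = sum((a - mean_l) * (b - mean_r) for a, b in zip(lw, rw))
    # Denominator: product of the two windows' unnormalized std deviations.
    var_l = sum((a - mean_l) ** 2 for a in lw)
    var_r = sum((b - mean_r) ** 2 for b in rw)
    den = math.sqrt(var_l * var_r)
    # A flat (textureless) window has zero variance; return 0 by convention.
    return num / den if den > 0 else 0.0


def zigzag_order(width, height):
    """Yield (x, y) positions in a serpentine order: left to right on even
    rows, right to left on odd rows (an assumed interpretation of 'zigzag')."""
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        for x in xs:
            yield (x, y)
```

Because ZNCC subtracts each window's mean and divides by its standard deviation, the score is unchanged when one image is brightened or scaled by a positive gain, which is one reason it is more robust than simpler costs. In the serpentine order sketched here, every consecutive pair of positions is adjacent, so consecutive windows overlap in all but one row or column; this is the kind of neighbor reuse the zigzag scan is meant to exploit, although this sketch recomputes each window from scratch.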