Indirect Time-of-Flight (I-ToF) imaging is a widespread depth estimation approach for mobile devices due to its small size and affordable price. Previous works have mainly focused on quality improvement of I-ToF imaging, especially mitigating the effects of Multi-Path Interference (MPI). These investigations are typically conducted in specifically constrained scenarios: at close distance, indoors, and under little ambient light. Surprisingly little work has investigated I-ToF quality improvement in real-life scenarios, where strong ambient light and long distances pose difficulties due to extreme amounts of induced shot noise and signal sparsity caused by attenuation under limited sensor power and by light scattering. In this work, we propose a new learning-based end-to-end depth prediction network that takes noisy raw I-ToF signals together with an RGB image and fuses their latent representations in a multi-step approach involving both implicit and explicit alignment, in order to predict a high-quality long-range depth map aligned to the RGB viewpoint. We evaluate our approach on challenging real-world scenes and show more than 40% RMSE improvement on the final depth map compared to the baseline approach.
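To make the two-branch fusion idea concrete, below is a minimal conceptual sketch in PyTorch. All module names, channel counts, and the simple concatenation-based fusion are illustrative assumptions for exposition only, not the authors' architecture; in particular, the explicit cross-view alignment step (e.g., warping ToF features into the RGB frame using sensor extrinsics) is omitted here.

    # Minimal conceptual sketch of an I-ToF + RGB fusion depth network (PyTorch).
    # Names, shapes, and the fusion strategy are illustrative assumptions,
    # NOT the paper's implementation.
    import torch
    import torch.nn as nn

    class ConvBlock(nn.Sequential):
        def __init__(self, c_in, c_out):
            super().__init__(
                nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )

    class ToFRGBFusionNet(nn.Module):
        """Encodes raw I-ToF correlation measurements and an RGB image in
        separate branches, fuses their latent features (implicit alignment
        stand-in), and decodes a depth map in the RGB view."""
        def __init__(self):
            super().__init__()
            # Raw I-ToF input assumed as 4 phase-correlation channels.
            self.tof_enc = nn.Sequential(ConvBlock(4, 32), ConvBlock(32, 64))
            self.rgb_enc = nn.Sequential(ConvBlock(3, 32), ConvBlock(32, 64))
            self.fuse = nn.Conv2d(128, 64, 1)  # feature-level fusion
            self.dec = nn.Sequential(
                nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
                nn.Conv2d(64, 1, 3, padding=1),  # depth aligned to RGB viewpoint
            )

        def forward(self, tof_raw, rgb):
            f = torch.cat([self.tof_enc(tof_raw), self.rgb_enc(rgb)], dim=1)
            return self.dec(self.fuse(f))

    # Usage: 4-channel raw correlations and a 3-channel RGB frame, same resolution.
    net = ToFRGBFusionNet()
    depth = net(torch.randn(1, 4, 240, 320), torch.randn(1, 3, 240, 320))
    print(depth.shape)  # torch.Size([1, 1, 240, 320])

A real system would replace the plain concatenation with the paper's multi-step implicit and explicit alignment and train end-to-end against long-range depth supervision; the sketch only fixes the overall input/output contract.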