Depth estimation has wide-reaching applications in computer vision, such as target tracking, augmented reality, and self-driving cars. The goal of monocular depth estimation is to predict a depth map from a single 2D RGB image. Traditional depth estimation methods rely on depth cues and on concepts such as epipolar geometry. With the evolution of convolutional neural networks, depth estimation has made tremendous strides. In this project, we aim to explore possible extensions to existing state-of-the-art (SoTA) deep-learning-based depth estimation models and to see whether their performance metrics can be further improved. More broadly, we are looking at the possibility of incorporating pose estimation, efficient sub-pixel convolution interpolation, and semantic segmentation estimation techniques to enhance our proposed architecture and to produce fine-grained, more globally coherent depth map predictions. We also plan to do away with camera intrinsic parameters during training and to apply weather augmentations so that our model generalizes better.