Depth-from-focus (DFF) is a technique that infers depth from the focus changes of a camera. In this work, we propose a convolutional neural network (CNN) that finds the best-focused pixels in a focal stack and infers depth from this focus estimation. The key innovation of the network is a novel deep differential focus volume (DFV). By computing the first-order derivative over the features stacked at different focal distances, the DFV captures both focus and context information for focus analysis. In addition, we introduce a probability regression mechanism for focus estimation that handles sparsely sampled focal stacks and provides an uncertainty estimate for the final prediction. Comprehensive experiments demonstrate that the proposed model achieves state-of-the-art performance on multiple datasets with good generalizability and fast speed.
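The two ideas in the abstract, differencing stacked features along the focal dimension and regressing depth as a probability-weighted average of focal distances, can be sketched in a few lines. This is a minimal numpy illustration, not the paper's actual network: the feature shapes, the channel-sum focus score, and the focal distances are all illustrative assumptions.

```python
import numpy as np

def differential_focus_volume(features):
    """First-order difference of per-slice features along the focal axis.

    features: (S, C, H, W) array of CNN features for a focal stack with
    S focal distances (shape convention is an assumption for this sketch).
    Adjacent slices are subtracted, so sharp-to-blurry transitions
    (i.e., focus changes) yield large responses.
    """
    return features[1:] - features[:-1]          # (S-1, C, H, W)

def focus_probability(dfv):
    """Per-pixel probability of best focus over the focal dimension."""
    # Collapse channels into a scalar focus score per slice (a stand-in
    # for the network's learned scoring), then softmax over slices.
    score = dfv.sum(axis=1)                      # (S-1, H, W)
    e = np.exp(score - score.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

# Probability regression: depth is the expectation of (hypothetical)
# focal distances under the per-pixel focus probabilities. This is how
# a sparsely sampled stack can still yield sub-slice depth estimates.
S, C, H, W = 5, 3, 4, 4
rng = np.random.default_rng(0)
feats = rng.standard_normal((S, C, H, W))
focal_dists = np.linspace(0.1, 1.0, S - 1)      # assumed distances
prob = focus_probability(differential_focus_volume(feats))
depth = (prob * focal_dists[:, None, None]).sum(axis=0)   # (H, W)
```

The same probabilities double as an uncertainty signal: a peaked distribution over slices means a confident depth estimate, while a flat one flags ambiguous (e.g., textureless) pixels.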