The numerical wavefront backpropagation principle of digital holography confers unique extended focus capabilities, without mechanical displacements along the z-axis. However, determining the correct focusing distance is non-trivial and time-consuming. A deep learning (DL) solution is proposed that casts autofocusing as a regression problem, and it is tested on both experimental and simulated holograms. Single-wavelength digital holograms were recorded by a Digital Holographic Microscope (DHM) with a 10$\times$ microscope objective from a patterned target moving in 3D over an axial range of 92 $\mu$m. Tiny DL models are proposed and compared, including a tiny Vision Transformer (TViT), a tiny VGG16 (TVGG) and a tiny Swin-Transformer (TSwinT). The proposed tiny networks are compared with their original versions (ViT/B16, VGG16 and Swin-Transformer Tiny) and with the main neural networks used in digital holography, such as LeNet and AlexNet. The experiments show that the focusing distance $Z_R^{\mathrm{Pred}}$ is inferred with an accuracy of 1.2 $\mu$m on average, compared with the DHM depth of field of 15 $\mu$m. Numerical simulations show that all tiny models give $Z_R^{\mathrm{Pred}}$ with an error below 0.3 $\mu$m. Such a prospect would significantly improve the current capabilities of computer vision position sensing in applications such as 3D microscopy for life sciences or micro-robotics. Moreover, all models reach a CPU inference time below 25 ms per inference. With respect to occlusions, TViT, owing to its Transformer architecture, is the most robust.
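As an illustration of what numerical backpropagation and the regression formulation involve, the following sketch uses the angular spectrum method, one common propagation model. The abstract does not state which propagation model, loss function or sign convention is used, so the symbols $U$, $\lambda$, $f_x$, $f_y$, $H$, $f_\theta$, $Z_R^{\mathrm{True}}$ and $N$ are assumptions introduced here for illustration only. Given the complex field $U(x,y;0)$ in the hologram plane, the field refocused at a distance $z$ can be written as

$$
U(x,y;z) = \mathcal{F}^{-1}\left\{ \mathcal{F}\{U(x,y;0)\}(f_x,f_y)\, \exp\left[ i 2\pi z \sqrt{\frac{1}{\lambda^{2}} - f_x^{2} - f_y^{2}} \right] \right\},
$$

where $\mathcal{F}$ is the 2D Fourier transform, $\lambda$ the wavelength and $(f_x,f_y)$ the spatial frequencies. A conventional autofocus search would evaluate this expression for many candidate values of $z$ and keep the sharpest reconstruction. Under the regression view, a network $f_\theta$ would instead map a hologram $H$ directly to the distance, $Z_R^{\mathrm{Pred}} = f_\theta(H)$, with parameters $\theta$ fitted by minimizing, for example, a mean absolute error over $N$ training holograms:

$$
\mathcal{L}(\theta) = \frac{1}{N} \sum_{n=1}^{N} \left| f_\theta(H_n) - Z_{R,n}^{\mathrm{True}} \right| .
$$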