Multi-focus image fusion (MFIF) addresses the depth-of-field (DOF) limitations of optical lenses, where only objects within a specific range appear sharp. Although traditional and deep learning methods have advanced the field, challenges persist, including limited training data, domain gaps from synthetic datasets, and difficulty handling regions that lack information. We propose VAEEDOF, a novel MFIF method that uses a distilled variational autoencoder for high-fidelity, efficient image reconstruction. Our fusion module processes up to seven images simultaneously, enabling robust fusion across diverse focus points. To address data scarcity, we introduce MattingMFIF, a new synthetic 4K dataset that simulates realistic DOF effects from real photographs. Our method achieves state-of-the-art results, generating seamless, artifact-free fused images and bridging the gap between synthetic and real-world scenarios, offering a significant step forward in addressing complex MFIF challenges. The code and weights are available here: