Multi-baseline Synthetic Aperture Radar (SAR) three-dimensional (3D) tomography is a crucial remote sensing technique that provides 3D resolution unavailable in conventional SAR imaging. However, achieving high-quality imaging typically requires multi-angle or full-aperture data, resulting in significant imaging costs. Recent advances in sparse 3D SAR, which relies on data from limited apertures, have gained attention as a cost-effective alternative. Notably, deep learning (DL) techniques have markedly enhanced the imaging quality of sparse 3D SAR. Despite these advances, existing methods primarily depend on high-resolution radar images to supervise the training of deep neural networks (DNNs). This exclusive dependence on single-modal data prevents the introduction of complementary information from other data sources, limiting further improvement in imaging performance. In this paper, we introduce a Cross-Modal 3D-SAR Reconstruction Network (CMAR-Net) that enhances 3D SAR imaging by integrating heterogeneous information. Leveraging cross-modal supervision from 2D optical images, with error transfer guaranteed by differentiable rendering, CMAR-Net trains efficiently and reconstructs highly sparse multi-baseline SAR data into visually structured and accurate 3D images, particularly of vehicle targets. Extensive experiments on simulated and real-world datasets demonstrate that CMAR-Net significantly outperforms state-of-the-art (SOTA) sparse reconstruction algorithms based on compressed sensing (CS) and DL. Furthermore, our method eliminates the need for time-consuming full-aperture data preprocessing and relies solely on computer-rendered optical images, greatly reducing dataset construction costs. This work highlights the potential of DL for multi-baseline SAR 3D imaging and introduces a novel cross-modal learning framework for radar imaging research.
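The abstract does not detail CMAR-Net's renderer or loss, but the core mechanism it names, back-propagating a 2D optical-image loss through a differentiable rendering step into a 3D reconstruction, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a voxel-occupancy output and a simple orthographic silhouette projection, and the names `DifferentiableProjector`, `cross_modal_loss`, and all tensor shapes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DifferentiableProjector(nn.Module):
    """Collapses a 3D voxel volume to a 2D silhouette along one spatial axis.

    Hypothetical stand-in for a differentiable renderer: every operation is
    differentiable, so a 2D loss on the projection flows back into the 3D volume.
    """

    def __init__(self, axis: int = 0):
        super().__init__()
        self.axis = axis  # 0 -> depth, 1 -> height, 2 -> width

    def forward(self, volume: torch.Tensor) -> torch.Tensor:
        # volume: (B, D, H, W), occupancies in [0, 1].
        # Soft "any voxel occupied along the ray": 1 - prod(1 - v).
        return 1.0 - torch.prod(1.0 - volume, dim=self.axis + 1)


def cross_modal_loss(pred_volume, optical_targets, projectors):
    """Supervise a 3D prediction with 2D optical references via projection."""
    loss = pred_volume.new_zeros(())
    for proj, target in zip(projectors, optical_targets):
        silhouette = proj(pred_volume).clamp(1e-6, 1.0 - 1e-6)
        loss = loss + F.binary_cross_entropy(silhouette, target)
    return loss


# Usage: three orthographic optical views supervising one predicted volume.
volume = torch.rand(2, 32, 32, 32, requires_grad=True)  # stand-in for a DNN output
projectors = [DifferentiableProjector(axis) for axis in range(3)]
targets = [torch.rand(2, 32, 32) for _ in range(3)]      # rendered optical silhouettes
cross_modal_loss(volume, targets, projectors).backward()  # gradients reach the 3D volume
```

The design point this sketch illustrates is the one the abstract relies on: because the projection is differentiable end to end, 2D optical images alone can supply the training signal, so no high-resolution 3D radar ground truth is needed.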