This paper introduces a framework for high-accuracy outdoor user equipment (UE) positioning that applies a conditional generative diffusion model directly to high-dimensional massive MIMO channel state information (CSI). Traditional fingerprinting methods struggle to scale to large, dynamic outdoor environments and require dense, impractical data surveys. To overcome these limitations, our approach learns a direct mapping from raw uplink Sounding Reference Signal (SRS) fingerprints to continuous geographic coordinates. We demonstrate that our DiffLoc framework reaches sub-centimeter precision with fusion: our best model (DiffLoc-CT) delivers 0.5 cm fusion accuracy and 1-2 cm single base station (BS) accuracy in a realistic, ray-traced Tokyo urban macro-cell environment. This is an improvement of more than two orders of magnitude over existing methods, including supervised regression approaches (over 10 m error) and grid-based fusion (3 m error). Our consistency training approach reduces inference from 200 sampling steps to just 2 while maintaining this accuracy for high-speed users (15-25 m/s) and for unseen user trajectories, demonstrating the practical feasibility of the framework for real-time 6G applications.
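To make the 2-step inference claim concrete, the following is a minimal sketch of generic two-step consistency sampling, not the DiffLoc implementation. Everything named here is an assumption for illustration: the noise levels `EPS` and `T_MAX`, the intermediate level `t_mid`, and `toy_consistency_fn`, which stands in for a trained network conditioned on a CSI fingerprint. In the toy, the condition simply carries the true position so that the result can be checked.

```python
import numpy as np

EPS = 0.002    # smallest noise level (assumed value)
T_MAX = 80.0   # largest noise level (assumed value)


def toy_consistency_fn(x, t, cond):
    """Hypothetical stand-in for a trained consistency model f(x, t, cond).

    This is the exact consistency function when the position distribution
    given `cond` is a point mass at cond["pos"]. A real model would take a
    CSI fingerprint as `cond` and would be learned from data.
    """
    target = cond["pos"]
    return target + (EPS / t) * (x - target)


def two_step_sample(f, cond, dim=2, t_mid=0.8, rng=None):
    """Draw one position estimate with two evaluations of f."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: map pure noise at the largest noise level to an estimate.
    x_T = T_MAX * rng.standard_normal(dim)
    x = f(x_T, T_MAX, cond)
    # Step 2: re-noise to an intermediate level, then map back once more.
    z = rng.standard_normal(dim)
    x_mid = x + np.sqrt(t_mid**2 - EPS**2) * z
    return f(x_mid, t_mid, cond)


if __name__ == "__main__":
    cond = {"pos": np.array([12.5, -3.0])}  # toy condition, not real CSI
    print(two_step_sample(toy_consistency_fn, cond, rng=np.random.default_rng(0)))
```

The point of the sketch is the cost structure: each estimate needs two network evaluations instead of the 200 denoising steps of a standard diffusion sampler, which is what makes real-time use plausible.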