Large language models (LLMs) and large multimodal models (LMMs) have achieved unprecedented breakthroughs, showcasing remarkable capabilities in natural language understanding, generation, and complex reasoning. This transformative potential has positioned them as key enablers for 6G autonomous communications among machines, vehicles, and humanoids. In this article, we provide an overview of task-oriented autonomous communications with LLMs/LMMs, focusing on multimodal sensing integration, adaptive reconfiguration, and prompting/fine-tuning strategies for wireless tasks. We demonstrate the framework through three case studies: LMM-based traffic control, LLM-based robot scheduling, and LMM-based environment-aware channel estimation. Experimental results show that the proposed LLM/LMM-aided autonomous systems significantly outperform both conventional techniques and those based on discriminative deep learning (DL) models, maintaining robustness under dynamic objectives, varying input parameters, and heterogeneous multimodal conditions where conventional static optimization degrades.