We present PEARL (Peer-Enhanced Adaptive Radio via On-Device LLM), a framework for cooperative cross-layer optimization in device-to-device (D2D) communication. Building on our previous work on on-device LLMs for single-device optimization, PEARL extends the paradigm by leveraging both publisher and subscriber states to guide Wi-Fi Aware (WA) parameter selection. A context-aware reward, which normalizes latency by application tolerances and modulates energy by device battery states, provides richer supervision for KL-based finetuning. We study two lightweight variants: PEARL (Head + Low-Rank Adaptation (LoRA)) achieves the best overall performance, while PEARL-Lite (Head-only) delivers sub-20 ms inference at near-identical objective scores. Across synthetic scenarios grounded in real measurements, PEARL improves objective scores over heuristic and compact-model baselines and reduces energy by up to 16% in cooperative low-battery cases. These results demonstrate that peer-aware context, reward-aligned training, and head-based efficiency make LLMs practical for always-on, on-device cross-layer control. Code, a real-world demo, and the dataset are available at https://github.com/abman23/pearl.
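To make the context-aware reward concrete, the following is a minimal sketch assuming one plausible form: latency is divided by an application-specific tolerance, and the energy penalty is scaled up as the lower of the two peers' battery levels drops. The function name, weights `alpha` and `beta`, and the exact functional form are illustrative assumptions, not the paper's formula.

```python
# Hypothetical sketch of a context-aware reward in PEARL's style
# (assumed form; the abstract does not specify the exact formula).

def context_aware_reward(
    latency_ms: float,            # observed end-to-end latency
    energy_mj: float,             # energy cost of the chosen WA configuration
    latency_tolerance_ms: float,  # application-specific latency budget
    pub_battery: float,           # publisher battery level in [0, 1]
    sub_battery: float,           # subscriber battery level in [0, 1]
    alpha: float = 1.0,           # latency weight (assumed)
    beta: float = 1.0,            # base energy weight (assumed)
) -> float:
    # Normalize latency by the application's tolerance so rewards are
    # comparable across applications with different latency budgets.
    latency_term = latency_ms / latency_tolerance_ms

    # Modulate the energy penalty by the scarcer peer's battery: the
    # penalty grows as either device's battery drops, which encourages
    # energy savings in cooperative low-battery cases.
    battery = min(pub_battery, sub_battery)
    energy_weight = beta * (2.0 - battery)  # assumed modulation

    # Higher reward for lower normalized latency and lower weighted energy.
    return -(alpha * latency_term + energy_weight * energy_mj)
```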