The coordination of robotic swarms and the remote wireless control of industrial systems are among the major use cases for 5G and beyond systems: in these scenarios, the massive amount of sensory information that needs to be shared over the wireless medium can overload even high-capacity connections. Consequently, solving the effective communication problem by optimizing the transmission strategy to discard irrelevant information can provide a significant advantage, but is often a very complex task. In this work, we consider a prototypical system in which an observer must communicate its sensory data to an actor controlling a task (e.g., a mobile robot in a factory). We then model it as a remote Partially Observable Markov Decision Process (POMDP), considering the effect of adopting semantic and effective communication-oriented solutions on the overall system performance. We split the communication problem by considering an ensemble Vector Quantized Variational Autoencoder (VQ-VAE) encoding, and train a Deep Reinforcement Learning (DRL) agent to dynamically adapt the quantization level, considering both the current state of the environment and the memory of past messages. We tested the proposed approach on the well-known CartPole reference control problem, obtaining a significant performance increase over traditional approaches.
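The quantization step at the heart of a VQ-VAE can be illustrated as a nearest-neighbour codebook lookup: each latent vector is replaced by the closest codebook entry, so only the entry's index (log2 of the codebook size, in bits) needs to be transmitted. The sketch below is a minimal NumPy illustration of this idea; the function name, dimensions, and codebook size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each latent vector in z (N, D) to its nearest codebook entry (K, D)."""
    # Squared Euclidean distance between every latent and every codebook entry.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = d.argmin(axis=1)           # symbols to transmit: one index per latent
    return idx, codebook[idx]        # indices and the quantized reconstructions

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # K=8 entries -> 3 bits per latent vector
z = rng.normal(size=(5, 4))          # a batch of 5 latent vectors
idx, z_q = vector_quantize(z, codebook)
```

Adapting the quantization level, as the DRL agent in the paper does, amounts to switching between codebooks of different sizes K, trading reconstruction fidelity against the number of bits sent per message.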