Transmit What You Need: Task-Adaptive Semantic Communications for Visual Information

Semantic communications have recently drawn great attention as a groundbreaking concept that aims to go beyond the capacity limits of Shannon's theory. In particular, semantic communications are likely to become crucial for visual tasks that generate massive network traffic. Although computer vision tasks rely on highly distinct forms of visual semantics, no thorough investigation has yet been reported of which visual semantics can be transmitted in time and which are required to complete different visual tasks. To this end, we first examine the achievable throughput when transmitting existing visual semantics over limited wireless communication bandwidth. We then evaluate the resulting performance of various visual tasks for each type of visual semantics. Based on this empirical testing, we argue that a task-adaptive selection of visual semantics is crucial for real-time semantic communications for visual tasks: we transmit basic semantics (e.g., the objects in a given image) for simple visual tasks, such as classification, and richer semantics (e.g., scene graphs) for complex tasks, such as image regeneration. To further improve transmission efficiency, we propose a filtering method for scene graphs that drops redundant information, so that only the semantics essential to the given task are sent. We confirm the efficacy of our task-adaptive semantic communication approach through extensive simulations over wireless channels, showing more than 45 times higher throughput than naive transmission of the original data. The source code for reproducing our work is available at: https://github.com/jhpark2024/jhpark.github.io
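
The task-adaptive selection and scene-graph filtering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SEMANTICS_FOR_TASK`, `Triple`, `select_semantics`, `filter_scene_graph`, and `build_payload` are hypothetical, and the filtering criterion (drop duplicate triples and triples that touch no task-relevant object) is an assumption, since the abstract does not state how redundancy is determined.

```python
from dataclasses import dataclass

# Hypothetical lookup table: the abstract names only these two pairings
# (objects for classification, scene graphs for image regeneration).
SEMANTICS_FOR_TASK = {
    "classification": "objects",
    "image_regeneration": "scene_graph",
}


@dataclass(frozen=True)
class Triple:
    """One scene-graph edge, e.g. (man, rides, horse)."""
    subject: str
    predicate: str
    obj: str


def select_semantics(task):
    """Return the kind of visual semantics to transmit for a task."""
    return SEMANTICS_FOR_TASK[task]


def filter_scene_graph(triples, keep_objects):
    """Assumed filtering rule, not the paper's: drop exact duplicates and
    triples whose subject and object are both outside keep_objects."""
    seen = set()
    kept = []
    for t in triples:
        if t in seen:
            continue  # duplicate edge carries no new information
        seen.add(t)
        if t.subject in keep_objects or t.obj in keep_objects:
            kept.append(t)
    return kept


def build_payload(task, objects, triples, keep_objects):
    """Assemble what would be sent over the channel for the given task."""
    kind = select_semantics(task)
    if kind == "objects":
        # Simple task: send only the de-duplicated object labels.
        return {"kind": kind, "data": sorted(set(objects))}
    # Complex task: send the filtered scene graph.
    return {"kind": kind, "data": filter_scene_graph(triples, keep_objects)}
```

In this sketch, a classification request results in a short list of labels, while an image-regeneration request results in a filtered list of triples, so the payload size follows the needs of the task rather than the size of the original image.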
