This paper presents a novel approach to multimodal data fusion based on the Vector-Quantized Variational Autoencoder (VQVAE) architecture. The proposed method is simple yet effective, achieving excellent reconstruction performance on paired MNIST-SVHN data and on WiFi spectrogram data. The multimodal VQVAE model is also extended to a 5G communication scenario, in which an end-to-end Channel State Information (CSI) feedback system compresses the data transmitted between the base station (eNodeB) and the User Equipment (UE) without significant loss of performance. The proposed model learns a discriminative compressed feature space for various types of input data (CSI, spectrograms, natural images, etc.), making it a suitable solution for applications with limited computational resources.
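To illustrate why a vector-quantized feature space lends itself to compressed feedback, the following is a minimal sketch of the generic VQ-VAE quantization step: each latent vector is replaced by the index of its nearest codebook entry, only the indices are transmitted, and the receiver recovers the vectors by table lookup. This is not the paper's implementation. The codebook size `K`, latent dimension `D`, number of latent vectors `N`, the random codebook, and the function names are all assumptions chosen for illustration; in a trained model the codebook and the latents would come from the learned encoder.

```python
import math
import random


def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codebook
    entry under squared Euclidean distance."""
    indices = []
    for z in latents:
        dists = [sum((zi - ci) ** 2 for zi, ci in zip(z, code))
                 for code in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def dequantize(indices, codebook):
    """Receiver side: look up the codebook vector for each received index."""
    return [list(codebook[i]) for i in indices]


def bits_per_index(codebook_size):
    """Bits needed to address one entry in a codebook of the given size."""
    return math.ceil(math.log2(codebook_size))


# Hypothetical sizes, not taken from the paper.
K, D, N = 16, 4, 8  # codebook entries, latent dimension, latent vectors

rng = random.Random(0)
# Stand-ins for a learned codebook and for encoder outputs.
codebook = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
latents = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]

indices = quantize(latents, codebook)       # what the sender would transmit
recovered = dequantize(indices, codebook)   # what the receiver reconstructs
payload_bits = N * bits_per_index(K)        # size of the index payload
```

With these illustrative sizes the payload is 8 indices of 4 bits each, i.e. 32 bits, whereas sending the same 8 four-dimensional latents as 32-bit floats would take 1024 bits. The figures only show the mechanism and say nothing about the compression ratios reported in the paper.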