An avatar is the representation of a physical user in the virtual world, able to engage in different activities and interact with other objects in the metaverse. Simulating an avatar requires accurate human pose estimation. Although camera-based solutions yield remarkable performance, they raise privacy concerns and their performance degrades under varying illumination, especially in smart homes. In this paper, we propose MetaFi, a WiFi-based, IoT-enabled human pose estimation scheme for metaverse avatar simulation. Specifically, a deep neural network with customized convolutional layers and residual blocks is designed to map channel state information (CSI) to human pose landmarks. The network is trained on annotations produced by an accurate computer vision model, thus achieving cross-modal supervision. WiFi is ubiquitous and robust to illumination, making it a feasible solution for avatar applications in smart homes. Experiments conducted in the real world show that MetaFi achieves a PCK@50 of 95.23%.
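As an illustration of the reported metric, the following is a minimal sketch of how a PCK (Percentage of Correct Keypoints) score might be computed. It assumes that a predicted keypoint counts as correct when its distance to the ground truth, divided by a per-frame reference length, is at most a threshold `alpha` (0.5 for PCK@50). The function name `pck`, the 2D keypoint layout, and the choice of the two reference keypoints are assumptions made for this example; the abstract does not specify the normalization used.

```python
import math


def pck(pred, truth, ref_a, ref_b, alpha=0.5):
    """Fraction of keypoints whose normalized error is at most alpha.

    pred, truth: lists of frames; each frame is a list of (x, y) keypoints.
    ref_a, ref_b: indices of two ground-truth keypoints whose distance is
    used as the per-frame reference length (an assumed choice, for example
    a shoulder and the opposite hip).
    """
    correct = 0
    total = 0
    for p_frame, t_frame in zip(pred, truth):
        ref_len = math.dist(t_frame[ref_a], t_frame[ref_b])
        if ref_len == 0:
            continue  # skip frames with a degenerate reference length
        for p, t in zip(p_frame, t_frame):
            if math.dist(p, t) / ref_len <= alpha:
                correct += 1
            total += 1
    return correct / total if total else 0.0


# Toy data: one frame with three keypoints; reference length is 10.
truth = [[(0.0, 0.0), (0.0, 10.0), (5.0, 5.0)]]
pred = [[(0.0, 1.0), (0.0, 10.0), (5.0, 11.0)]]
score = pck(pred, truth, ref_a=0, ref_b=1, alpha=0.5)
```

In this toy frame the normalized errors are 0.1, 0.0, and 0.6, so two of the three keypoints fall within the 0.5 threshold.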