Low-power edge-AI capabilities are essential for on-device extended reality (XR) applications supporting the vision of the Metaverse. In this work, we investigate two representative XR workloads for hardware design-space exploration: (i) hand detection and (ii) eye segmentation. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulation, we evaluate a CPU and two systolic inference-accelerator implementations, and we compare these hardware solutions across advanced technology nodes. We then evaluate the impact of integrating state-of-the-art emerging non-volatile memory technologies (STT, SOT, and VGSOT MRAM) into the XR-AI inference pipeline. We find that introducing non-volatile memory into the memory hierarchy of designs at the 7nm node yields significant energy savings (>=24%) for hand detection (IPS=10) and eye segmentation (IPS=0.1) while meeting the minimum inferences per second (IPS). Moreover, the small form factor of MRAM relative to traditional SRAM enables a substantial area reduction (>=30%).