Low-power Edge-AI capabilities are essential for on-device extended reality (XR) applications that support the vision of the Metaverse. In this work, we investigate two representative XR workloads, (i) hand detection and (ii) eye segmentation, for hardware design space exploration. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions at advanced technology nodes and evaluate the impact of integrating state-of-the-art emerging non-volatile memory technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline. We find that significant energy benefits (>=24%) can be achieved for hand detection (IPS=10) and eye segmentation (IPS=0.1) by introducing non-volatile memory into the memory hierarchy for designs at the 7nm node, while still meeting the minimum IPS (inferences per second) requirement. Moreover, a substantial reduction in area (>=30%) can be realized owing to the smaller form factor of MRAM compared to traditional SRAM.