Neural Radiance Field (NeRF)-based rendering has attracted growing attention thanks to its state-of-the-art (SOTA) rendering quality and wide applications in Augmented and Virtual Reality (AR/VR). However, immersive real-time (> 30 FPS) interactions enabled by NeRF-based rendering remain limited by the low achievable throughput on AR/VR devices. To this end, we first profile SOTA efficient NeRF algorithms on commercial devices and identify two primary causes of this inefficiency: (1) uniform point sampling and (2) the dense accesses and computations of the required embeddings in NeRF. We then propose RT-NeRF, which to the best of our knowledge is the first algorithm-hardware co-design acceleration of NeRF. Specifically, on the algorithm level, RT-NeRF integrates an efficient rendering pipeline that largely alleviates the inefficiency caused by the commonly adopted uniform point sampling method in NeRF by directly computing the geometry of pre-existing points. Additionally, RT-NeRF leverages a coarse-grained view-dependent computing ordering scheme to eliminate the unnecessary processing of invisible points. On the hardware level, our proposed RT-NeRF accelerator (1) adopts a hybrid encoding scheme that adaptively switches between a bitmap-based and a coordinate-based sparsity encoding format for NeRF's sparse embeddings, aiming to maximize storage savings and thus reduce the required DRAM accesses while supporting efficient NeRF decoding; and (2) integrates both a dual-purpose bi-direction adder & search tree and a high-density sparse search unit to coordinate the two aforementioned encoding formats. Extensive experiments on eight datasets consistently validate the effectiveness of RT-NeRF, which achieves a large throughput improvement (e.g., 9.7× to 3,201×) over SOTA efficient NeRF solutions while maintaining the rendering quality.
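To make the hybrid encoding idea concrete, the following is a minimal sketch of how a storage-driven choice between a bitmap-based and a coordinate-based sparse format could look in software. It is not the RT-NeRF accelerator's actual selection rule: the cost model, the 16-bit value width, and all function names (`bitmap_bits`, `coordinate_bits`, `choose_format`, `encode`, `decode`) are assumptions made for illustration only.

```python
def bitmap_bits(n, nnz, value_bits):
    """Assumed cost model: one presence bit per slot plus the non-zero values."""
    return n + nnz * value_bits


def coordinate_bits(n, nnz, value_bits):
    """Assumed cost model: each non-zero stores its index and its value."""
    index_bits = max(1, (n - 1).bit_length())  # bits needed to address n slots
    return nnz * (index_bits + value_bits)


def choose_format(n, nnz, value_bits=16):
    """Pick whichever format needs fewer bits under the assumed cost model."""
    b = bitmap_bits(n, nnz, value_bits)
    c = coordinate_bits(n, nnz, value_bits)
    return ("bitmap", b) if b <= c else ("coordinate", c)


def encode(values, value_bits=16):
    """Encode a 1-D list of embedding values in the cheaper of the two formats."""
    n = len(values)
    nonzero = [(i, v) for i, v in enumerate(values) if v != 0]
    fmt, bits = choose_format(n, len(nonzero), value_bits)
    if fmt == "bitmap":
        mask = [1 if v != 0 else 0 for v in values]
        payload = (mask, [v for _, v in nonzero])
    else:
        payload = nonzero  # list of (index, value) pairs
    return fmt, bits, payload


def decode(fmt, payload, n):
    """Reconstruct the dense list from either encoding."""
    out = [0] * n
    if fmt == "bitmap":
        mask, vals = payload
        it = iter(vals)
        for i, m in enumerate(mask):
            if m:
                out[i] = next(it)
    else:
        for i, v in payload:
            out[i] = v
    return out


if __name__ == "__main__":
    very_sparse = [0] * 1024
    very_sparse[3] = 7
    very_sparse[900] = 5
    fmt, bits, payload = encode(very_sparse)
    print(fmt, bits)  # coordinate format wins for very sparse data
    assert decode(fmt, payload, len(very_sparse)) == very_sparse
```

Under this assumed cost model, very sparse data favors the coordinate format because few indices are stored, while denser data favors the bitmap format because its per-slot overhead is a single bit. This is the kind of trade-off an adaptive switch between the two formats would exploit.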