3D Gaussian Splatting (3DGS) has recently emerged as a foundational technique for real-time neural rendering, 3D scene generation, and volumetric video (4D) capture. However, its rendering and training demand massive computation, making real-time rendering on edge devices and real-time 4D reconstruction on workstations currently infeasible. Given its fixed-function nature and similarity to traditional rasterization, 3DGS presents a strong case for dedicated hardware in the graphics pipeline of next-generation GPUs. This work, Vorion, presents the first GPGPU prototype with hardware-accelerated 3DGS rendering and training. Vorion features a scalable architecture, minimal hardware changes to traditional rasterizers, z-tiling to increase parallelism, and a hybrid Gaussian-centric/pixel-centric dataflow. We prototype the minimal system (8 SIMT cores, 2 Gaussian rasterizers) in TSMC 16nm FinFET technology, which achieves 19 FPS for rendering. The scaled design with 16 rasterizers achieves 38.6 iterations/s for training.