Recent advancements in computer vision have successfully extended open-vocabulary segmentation (OVS) to the 3D domain by leveraging 3D Gaussian Splatting (3D-GS). Despite this progress, efficiently rendering the high-dimensional features required for open-vocabulary queries poses a significant challenge. Existing methods employ codebooks or feature compression, causing information loss and thereby degrading segmentation quality. To address this limitation, we introduce Quantile Rendering (Q-Render), a novel rendering strategy for 3D Gaussians that efficiently handles high-dimensional features while maintaining high fidelity. Unlike conventional volume rendering, which densely samples all 3D Gaussians intersecting each ray, Q-Render sparsely samples only those with dominant influence along the ray. By integrating Q-Render into a generalizable 3D neural network, we also propose the Gaussian Splatting Network (GS-Net), which predicts Gaussian features in a generalizable manner. Extensive experiments on ScanNet and LERF demonstrate that our framework outperforms state-of-the-art methods, while enabling real-time rendering with an approximately 43.7× speedup on 512-D feature maps. Code will be made publicly available.
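The contrast drawn above, dense alpha-blending over every Gaussian a ray intersects versus sparse sampling of only the dominant contributors, can be sketched as follows. This is an illustrative stand-in, not the paper's actual Q-Render algorithm: the quantile-based weight cutoff and the `composite_weights` helper are assumptions for exposition, and the real method operates inside a CUDA rasterizer rather than per-ray NumPy.

```python
import numpy as np

def composite_weights(alphas):
    """Standard front-to-back blending weights for one ray:
    w_i = alpha_i * prod_{j < i} (1 - alpha_j)."""
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    return alphas * transmittance

def dense_render(alphas, feats):
    """Conventional volume rendering: blend the high-dimensional
    feature of every Gaussian intersecting the ray."""
    w = composite_weights(alphas)
    return w @ feats

def sparse_render(alphas, feats, quantile=0.9):
    """Illustrative sparse variant: keep only Gaussians whose blending
    weight lies above the given quantile (the dominant contributors),
    renormalize, and blend only those. Hypothetical stand-in for Q-Render."""
    w = composite_weights(alphas)
    keep = w >= np.quantile(w, quantile)
    w_sparse = np.where(keep, w, 0.0)
    w_sparse /= w_sparse.sum() + 1e-8
    return w_sparse @ feats

# One ray intersecting 64 Gaussians, each carrying a 512-D feature.
rng = np.random.default_rng(0)
alphas = rng.uniform(0.01, 0.9, size=64)
feats = rng.normal(size=(64, 512))

dense = dense_render(alphas, feats)      # touches all 64 Gaussians
sparse = sparse_render(alphas, feats)    # blends only the dominant ~10%
```

The speedup claimed in the abstract comes from this kind of sparsity: with 512-D features, every skipped Gaussian avoids a 512-wide multiply-accumulate per ray.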

