3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS). However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization capacity for broader pose variations. In this paper, we alleviate the overfitting problem by introducing a self-ensembling Gaussian Splatting (SE-GS) approach. We present two Gaussian Splatting models named the $\mathbf{\Sigma}$-model and the $\mathbf{\Delta}$-model. The $\mathbf{\Sigma}$-model serves as the primary model that generates novel-view images during inference. At the training stage, the $\mathbf{\Sigma}$-model is guided away from specific local optima by an uncertainty-aware perturbing strategy. We dynamically perturb the $\mathbf{\Delta}$-model based on the uncertainties of novel-view renderings across different training steps, resulting in diverse temporal models sampled from the Gaussian parameter space without additional training costs. The geometry of the $\mathbf{\Sigma}$-model is regularized by penalizing discrepancies between the $\mathbf{\Sigma}$-model and the temporal samples. Therefore, our SE-GS conducts an effective and efficient regularization across a large number of Gaussian Splatting models, resulting in a robust ensemble, the $\mathbf{\Sigma}$-model. Experimental results on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets show that our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods. The code is released at https://github.com/sailor-z/SE-GS.
翻译:暂无翻译