Multi-layer perceptrons (MLP) have proven to be effective scene encoders when combined with higher-dimensional projections of the input, commonly referred to as \textit{positional encoding}. However, scenes with a wide frequency spectrum remain a challenge: choosing high frequencies for positional encoding introduces noise in low structure areas, while low frequencies result in poor fitting of detailed regions. To address this, we propose a progressive positional encoding, exposing a hierarchical MLP structure to incremental sets of frequency encodings. Our model accurately reconstructs scenes with wide frequency bands and learns a scene representation at progressive level of detail \textit{without explicit per-level supervision}. The architecture is modular: each level encodes a continuous implicit representation that can be leveraged separately for its respective resolution, meaning a smaller network for coarser reconstructions. Experiments on several 2D and 3D datasets show improvements in reconstruction accuracy, representational capacity and training speed compared to baselines.
翻译:多层透视器(MLP)被证明是有效的场景编码器,如果结合输入的更高层面的预测,通常称为\ textit{posial convention}。但是,具有广频谱的场景仍是一个挑战:选择高频率用于定位编码,在低结构区域引入噪音,而低频率则导致详细区域不适宜。为此,我们建议采用渐进位置编码,将等级的 MLP 结构暴露在增量的频率编码组合中。我们的模型精确地用宽频带重建场景,并学习在渐进层面的详细\ textit{没有明确的每个级别监督}的场景显示。结构是模块化的:每个级别将连续的隐含代表编码成一个连续的暗号,可以单独用于各自的分辨率,这意味着一个小的粗体重建网络。关于多个 2D 和 3D 数据集的实验表明,重建精确度、 代表能力和培训速度与基线相比有所改进。