Representing visual signals with coordinate-based deep fully-connected networks has been shown to be more advantageous than discrete grid-based representations in fitting complex details and solving inverse problems. However, acquiring such a continuous Implicit Neural Representation (INR) requires tedious per-scene training on a large number of signal measurements, which limits its practicality. In this paper, we present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID) from a data collection and representing an INR as a functional combination of basis functions sampled from the dictionary. Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space. After training, one can instantly and robustly acquire the representation of an unseen scene by solving for its coding coefficients. To optimize a large group of networks in parallel, we borrow the idea of Mixture-of-Experts (MoE) to design and train our network with a sparse gating mechanism. Our experiments show that NID can speed up the reconstruction of 2D images or 3D scenes by two orders of magnitude while using up to 98% less input data. We further demonstrate various applications of NID in image inpainting and occlusion removal, which are considered challenging with vanilla INR. Our code is available at https://github.com/VITA-Group/Neural-Implicit-Dict.
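To make the core idea concrete, the following is a minimal PyTorch sketch of representing a signal as a sparse, gated combination of shared coordinate-based basis subnetworks. It is an illustration of the concept described above, not the authors' implementation; all class and parameter names (`CoordinateBasis`, `NeuralImplicitDictionary`, `num_basis`, `top_k`, etc.) are hypothetical.

```python
# Hedged sketch: a dictionary of coordinate-based subnetworks combined via sparse
# gating coefficients. Names and hyperparameters are illustrative assumptions,
# not taken from the official repository.
import torch
import torch.nn as nn


class CoordinateBasis(nn.Module):
    """One coordinate-based subnetwork, i.e. a single basis function in the dictionary."""

    def __init__(self, in_dim=2, hidden=64, out_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, coords):
        return self.net(coords)


class NeuralImplicitDictionary(nn.Module):
    """Represents a signal as a sparse combination of shared basis subnetworks.

    Per-scene coding coefficients select and weight a small subset of the basis
    functions, in the spirit of Mixture-of-Experts sparse gating.
    """

    def __init__(self, num_basis=64, top_k=8, in_dim=2, out_dim=3):
        super().__init__()
        self.basis = nn.ModuleList(
            CoordinateBasis(in_dim, 64, out_dim) for _ in range(num_basis)
        )
        self.top_k = top_k

    def forward(self, coords, coefficients):
        # coefficients: (num_basis,) per-scene codes; keep only the top-k entries
        # so that each scene activates a sparse subset of the dictionary.
        topk_vals, topk_idx = torch.topk(coefficients, self.top_k)
        weights = torch.softmax(topk_vals, dim=-1)
        out = 0.0
        for w, idx in zip(weights, topk_idx):
            # Weighted sum of the selected basis subnetworks' outputs.
            out = out + w * self.basis[int(idx)](coords)
        return out


# Fitting an unseen scene then amounts to optimizing only `coeffs` with the
# dictionary weights frozen, which is far cheaper than training a full INR.
nid = NeuralImplicitDictionary()
coeffs = nn.Parameter(torch.randn(64))
coords = torch.rand(1024, 2)      # query pixel coordinates in [0, 1]^2
pred_rgb = nid(coords, coeffs)    # (1024, 3) predicted colors
```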