Data imputation is a prevalent and important task because missing data is ubiquitous. Many methods first draft a completed dataset and then refine it to obtain the final imputation, a practice we call "draft-then-refine" for short. In this work, we analyze this widespread practice from the perspective of Dirichlet energy. We find that a rudimentary "draft" imputation decreases the Dirichlet energy, so an energy-maintaining "refine" step is needed to recover the overall energy. Because existing "refine" methods such as the Graph Convolutional Network (GCN) tend to cause a further energy decline, we propose a novel framework called Graph Laplacian Pyramid Network (GLPN) to preserve Dirichlet energy and improve imputation performance. GLPN consists of a U-shaped autoencoder and residual networks, which capture global information and local detail, respectively. In extensive experiments on several real-world datasets, GLPN outperforms state-of-the-art methods under three different missing mechanisms. Our source code is available at https://github.com/liguanlue/GLPN.
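To make the energy argument concrete, the following is a minimal sketch and not the GLPN implementation. It assumes the unnormalized Laplacian L = D - A with Dirichlet energy E(x) = xᵀLx, a neighbor-mean fill as the rudimentary "draft", and a single Laplacian smoothing step as a stand-in for GCN-style propagation. The helper names, the toy path graph, and the step size `alpha` are all hypothetical.

```python
import numpy as np


def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A for a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj


def dirichlet_energy(x, lap):
    """E(x) = x^T L x, i.e. the sum over edges of w_ij * (x_i - x_j)^2."""
    return float(x @ lap @ x)


def neighbor_mean_draft(x, adj, missing):
    """Fill each missing node with the weighted mean of its observed neighbors.

    Assumes every missing node has at least one observed neighbor.
    """
    observed = np.ones(len(x), dtype=bool)
    observed[missing] = False
    values = np.where(observed, x, 0.0)  # ignore whatever sits in missing slots
    filled = values.copy()
    for i in missing:
        weights = adj[i] * observed
        filled[i] = weights @ values / weights.sum()
    return filled


def smooth(x, lap, alpha=0.25):
    """One low-pass step x - alpha * L x (illustrative stand-in for GCN propagation)."""
    return x - alpha * (lap @ x)


# Toy example (hypothetical data): a 5-node path graph with one missing node.
n = 5
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
lap = laplacian(adj)

x_full = np.array([1.0, 3.0, 0.0, 4.0, 2.0])  # ground-truth signal
missing = [2]                                  # node 2 is unobserved

x_draft = neighbor_mean_draft(x_full, adj, missing)
x_smooth = smooth(x_draft, lap)

e_full = dirichlet_energy(x_full, lap)
e_draft = dirichlet_energy(x_draft, lap)
e_smooth = dirichlet_energy(x_smooth, lap)
print(e_full, e_draft, e_smooth)
```

In this toy setting the draft cannot raise the energy, because the neighbor mean is exactly the value that minimizes the energy terms touching the filled node. The smoothing step lowers it again, since `alpha` times the largest Laplacian eigenvalue stays below 2. Both effects mirror the decline described above, although only for this simplified setup.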