Effective data imputation demands the ability to discover rich latent "structure" in "plain" tabular data. Recent graph neural network-based imputation solutions demonstrate strong structure-learning potential by directly translating tabular data into bipartite graphs. However, because they lack relations between samples, these solutions treat all samples equally, which contradicts an important observation: "similar samples should give more information about missing values." This paper presents a novel Iterative graph Generation and Reconstruction framework for Missing data imputation (IGRM). Instead of treating all samples equally, we introduce the concept of "friend networks" to represent the different relations among samples. To generate an accurate friend network in the presence of missing data, we design an end-to-end friend network reconstruction solution that allows the friend network to be optimized continuously during imputation learning. The representation of the optimized friend network is, in turn, used to further optimize the data imputation process through differentiated message passing. Experimental results on eight benchmark datasets show that IGRM yields a 39.13% lower mean absolute error compared with nine baselines and a 9.04% lower error than the second-best method.
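To make the two graph views in the abstract concrete, the following is a minimal sketch and not the IGRM implementation. It assumes a small NumPy table with `NaN` marking missing cells. The function names (`table_to_bipartite`, `friend_network`, `impute_from_friends`) are hypothetical, and the friend network here is a fixed nearest-neighbor graph based on mean absolute difference over co-observed features, standing in for the learned, iteratively reconstructed friend network that the abstract describes.

```python
import numpy as np


def table_to_bipartite(X):
    """View a table as a bipartite graph: one edge per observed cell.

    Returns (sample_idx, feature_idx, edge_values).
    """
    rows, cols = np.nonzero(~np.isnan(X))
    return rows, cols, X[rows, cols]


def friend_network(X, k=1):
    """Illustrative stand-in for a friend network (an assumption, not IGRM).

    Each sample is linked to its k closest samples, measured by mean
    absolute difference over the features both samples have observed.
    """
    n = X.shape[0]
    observed = ~np.isnan(X)
    dist = np.full((n, n), np.inf)  # inf keeps self and non-overlapping pairs last
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = observed[i] & observed[j]
            if shared.any():
                dist[i, j] = np.abs(X[i, shared] - X[j, shared]).mean()
    return np.argsort(dist, axis=1)[:, :k]


def impute_from_friends(X, friends):
    """Fill each missing cell with the mean of its friends' observed values."""
    out = X.copy()
    for i, j in zip(*np.nonzero(np.isnan(X))):
        vals = X[friends[i], j]
        vals = vals[~np.isnan(vals)]
        if vals.size:
            out[i, j] = vals.mean()
    return out


# Toy table: sample 0 is missing feature 2 and closely resembles sample 1.
X = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 2.1, 3.0],
    [9.0, 8.0, 7.0],
])
rows, cols, vals = table_to_bipartite(X)
friends = friend_network(X, k=1)
imputed = impute_from_friends(X, friends)
```

In this toy table, sample 0 borrows its missing value only from its similar neighbor (sample 1) rather than from all samples equally, which is the intuition behind differentiated message passing. In IGRM itself, by the abstract's description, the friend network is optimized end-to-end during imputation learning rather than computed once from raw distances.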