Imputation of missing data plays a vital role in many engineering and science applications. In experimental observations, missing data often arise from sensor limitations or post-processing transformation errors; in computer simulations, they arise from numerical and algorithmic constraints. One such instance, and the application emphasis of this paper, is the numerical simulation of storm surge. The simulation data correspond to time-series surge predictions at a number of save points within the geographic domain of interest, creating a spatio-temporal imputation problem in which the surge values are heavily correlated both spatially and temporally, and the regions of missing values are structurally distributed at random. Very recently, machine learning techniques such as neural network methods have been developed and employed for missing data imputation tasks. Generative Adversarial Nets (GANs) and GAN-based techniques have attracted particular attention as unsupervised machine learning methods. In this study, the performance of Generative Adversarial Imputation Nets (GAIN) is improved by using convolutional layers in place of fully connected layers, to better capture the correlation in the data and to promote learning from adjacent surge points. A further adjustment, needed specifically for the studied data, is to include the coordinates of the points as additional features, giving the model more information through the convolutional layers. We name the proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN). Its performance, with the improvements and adaptations required for the storm surge data, is assessed and compared to the original GAIN and a few other techniques. The results show that Conv-GAIN performs better than the alternative methods on the studied data.
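To make the data handling described above more concrete, the following is a minimal sketch, not the authors' implementation, of how GAIN-style generator inputs might be assembled with coordinate channels, and how observed and generated values are combined in the imputed output. The array layout (save points by time steps), the function names `build_generator_input` and `combine`, the two-column coordinate format, and the noise range are all assumptions made for illustration; the convolutional generator itself is omitted.

```python
import numpy as np

def build_generator_input(surge, mask, coords, rng):
    """Stack GAIN-style generator inputs as channels (illustrative layout).

    surge:  (n_points, n_steps) surge values; entries where mask == 0 are ignored
    mask:   (n_points, n_steps) 1.0 where observed, 0.0 where missing
    coords: (n_points, 2) coordinates of each save point (assumed format)
    """
    # Small uniform noise for the missing entries (range is an assumption).
    noise = rng.uniform(0.0, 0.01, size=surge.shape)
    # Observed entries keep their value; missing entries are filled with noise.
    filled = mask * np.nan_to_num(surge) + (1.0 - mask) * noise
    n_points, n_steps = surge.shape
    # Repeat each coordinate along the time axis to form one channel each.
    coord_channels = np.broadcast_to(coords.T[:, :, None], (2, n_points, n_steps))
    # Channel order: noise-filled data, mask, first coordinate, second coordinate.
    return np.concatenate([filled[None], mask[None], coord_channels], axis=0)

def combine(surge, mask, generated):
    """Keep observed values; use generator output only where data is missing."""
    return mask * np.nan_to_num(surge) + (1.0 - mask) * generated
```

In such a setup, the stacked array would be passed to a convolutional generator, and `combine` would be applied to its output so that observed surge values are never overwritten.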