A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic Time-Series Data, Application in Combined Sewer Flow Prediction

Despite breakthroughs in machine learning and data analysis techniques for the smart operation and management of urban water infrastructure, some key limitations obstruct progress. Two play a crucial role: freely available data are scarce because of data privacy concerns or the high cost of data gathering, and the data that are available contain too few rare or extreme events. Generative Adversarial Networks (GANs) can help overcome these challenges. In machine learning, generative models are a class of methods that learn a data distribution in order to generate artificial data. In this study, we developed a GAN model to generate synthetic time series that balance our limited recorded time-series data and improve the accuracy of a data-driven model for combined sewer flow prediction. We used the sewer system of a small town in Germany as the test case. Precipitation and inflow to the storage tanks are used to develop the data-driven model. The aim is to predict the flow from precipitation data and to examine the impact of data augmentation with synthetic data on model performance. Results show that the GAN can successfully generate synthetic time series from the real data distribution, which supports more accurate peak-flow prediction. However, the model without data augmentation works better for dry-weather prediction. Therefore, an ensemble model is suggested to combine the advantages of both models.
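The abstract does not say how the suggested ensemble would combine the two models. As a minimal sketch under stated assumptions, one option is to route each prediction by weather condition: the model trained without augmentation handles dry windows, and the model trained with GAN-augmented data handles wet windows. Every name here (`ensemble_predict`, `baseline_model`, `augmented_model`, `wet_threshold_mm`) and the threshold rule itself are hypothetical illustrations, not the authors' method.

```python
from typing import Callable, Sequence

# A predictor maps a window of recent precipitation values to a flow estimate.
# This interface is an assumption made for illustration only.
Predictor = Callable[[Sequence[float]], float]


def ensemble_predict(
    rain_window: Sequence[float],
    baseline_model: Predictor,
    augmented_model: Predictor,
    wet_threshold_mm: float = 0.1,  # hypothetical cut-off, not from the paper
) -> float:
    """Use the augmented model for wet windows and the baseline for dry ones."""
    # A window counts as wet if any precipitation value exceeds the threshold.
    is_wet = max(rain_window, default=0.0) > wet_threshold_mm
    model = augmented_model if is_wet else baseline_model
    return model(rain_window)


# Toy stand-ins for the two trained models (placeholders, not real models).
def baseline(window: Sequence[float]) -> float:
    return 5.0  # constant dry-weather flow


def augmented(window: Sequence[float]) -> float:
    return 5.0 + 2.0 * sum(window)  # flow grows with total rainfall


dry_flow = ensemble_predict([0.0, 0.0, 0.0], baseline, augmented)
wet_flow = ensemble_predict([0.0, 1.5, 0.5], baseline, augmented)
```

A hard switch is only one possible design. A weighted average of the two predictions, with weights depending on rainfall intensity, would give a smoother transition between dry and wet conditions.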

