Numerical simulations are ubiquitous in science and engineering. Machine learning for science investigates how artificial neural architectures can learn from these simulations to accelerate scientific discovery and engineering processes. Most of these architectures are trained in a supervised manner, requiring tremendous amounts of simulation data that are slow to generate and memory-intensive to store. In this article, we present our ongoing work on a training framework that alleviates these bottlenecks by generating data in parallel with the training process. This simultaneity biases the data available during training; we present a strategy to mitigate this bias with a memory buffer. We test our framework on the multi-parametric Lorenz attractor and show its benefit over offline training, as well as the success of our data bias mitigation strategy in capturing the system's complex chaotic dynamics.
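To make the idea concrete, the following is a minimal illustrative sketch (not the authors' implementation) of the two ingredients named above: a Lorenz-system simulator producing training pairs online, and a fixed-capacity memory buffer from which training batches are drawn uniformly at random, so the learner sees a mix of recent and older trajectory segments rather than only the most recently generated data. All names (`lorenz_step`, `MemoryBuffer`) and the Euler integrator are assumptions for illustration.

```python
import random
from collections import deque

def lorenz_step(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0, dt=0.01):
    """One explicit-Euler step of the Lorenz system:
    dx/dt = sigma (y - x), dy/dt = x (rho - z) - y, dz/dt = x y - beta z."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)

class MemoryBuffer:
    """Fixed-capacity buffer: new samples evict the oldest,
    and batches are sampled uniformly at random to reduce the
    recency bias induced by online data generation."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, sample):
        self.buffer.append(sample)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

# The simulator fills the buffer while, conceptually, a trainer
# concurrently draws mini-batches from it.
buf = MemoryBuffer(capacity=1000)
state = (1.0, 1.0, 1.0)
for _ in range(500):
    new_state = lorenz_step(state)
    buf.push((state, new_state))  # (input, target) pair for supervised training
    state = new_state
batch = buf.sample(32)
```

In an actual online-training setting the simulation loop and the training loop would run concurrently; this sequential sketch only shows how the buffer decouples the order in which data is generated from the order in which it is consumed.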