Large-scale finite element simulations of complex physical systems governed by partial differential equations (PDEs) crucially depend on adaptive mesh refinement (AMR) to allocate computational budget to regions where higher resolution is required. Existing scalable AMR methods make heuristic refinement decisions based on instantaneous error estimation and thus do not aim for long-term optimality over an entire simulation. We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning (RL) to train refinement policies directly from simulation. AMR poses a new problem for RL because both the state dimension and the available action set change at every step, which we solve by proposing new policy architectures with differing generality and inductive bias. The model sizes of these policy architectures are independent of the mesh size and hence can be deployed on larger simulations than those used at training time. We demonstrate in comprehensive experiments on static function estimation and time-dependent equations that RL policies can be trained on problems without using ground truth solutions, are competitive with a widely-used error estimator, and generalize to larger, more complex, and unseen test problems.
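A minimal sketch (not the authors' implementation) of the AMR-as-MDP idea described above may help make the abstract concrete: the observation is a per-element feature matrix whose first dimension (the number of mesh elements) grows after each refinement, the action set is the set of current elements, and the policy scores every element with the same small set of weights, so its parameter count is independent of mesh size. All names (`AMREnv`, `SharedElementPolicy`, the placeholder reward) are hypothetical illustrations only.

```python
import numpy as np

class AMREnv:
    """Toy environment: refining an element replaces it with two children."""
    def __init__(self, n_elements=8, feature_dim=4, seed=0):
        self.rng = np.random.default_rng(seed)
        self.feature_dim = feature_dim
        self.features = self.rng.normal(size=(n_elements, feature_dim))

    def observe(self):
        # State dimension = (current number of elements) x feature_dim.
        return self.features

    def step(self, element_id):
        # Action set = indices of current elements; it changes every step
        # because refinement replaces one element with two children.
        parent = self.features[element_id]
        children = parent + 0.1 * self.rng.normal(size=(2, self.feature_dim))
        self.features = np.concatenate(
            [np.delete(self.features, element_id, axis=0), children], axis=0)
        reward = -np.abs(parent).mean()   # placeholder error-reduction reward
        return self.observe(), reward

class SharedElementPolicy:
    """Scores each element with shared weights, then normalizes over elements,
    so the model size does not depend on how many elements the mesh has."""
    def __init__(self, feature_dim=4, seed=1):
        self.w = np.random.default_rng(seed).normal(size=feature_dim)

    def act(self, obs):
        logits = obs @ self.w                 # one logit per mesh element
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return int(np.argmax(probs))          # greedy choice for this sketch

env, policy = AMREnv(), SharedElementPolicy()
obs = env.observe()
for _ in range(3):
    obs, r = env.step(policy.act(obs))
    print(obs.shape, round(r, 3))             # element count grows each step
```

Because the policy applies the same weights to every element, the same trained parameters can be deployed on meshes far larger than those seen during training, which is the mesh-size independence claimed in the abstract.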