Machine learning-based modeling of physical systems has experienced increased interest in recent years. Despite some impressive progress, there is still a lack of benchmarks for Scientific ML that are easy to use but still challenging and representative of a wide range of problems. We introduce PDEBench, a benchmark suite of time-dependent simulation tasks based on Partial Differential Equations (PDEs). PDEBench comprises both code and data to benchmark the performance of novel machine learning models against both classical numerical simulations and machine learning baselines. Our proposed set of benchmark problems contributes the following unique features: (1) a much wider range of PDEs compared to existing benchmarks, ranging from relatively common examples to more realistic and difficult problems; (2) much larger ready-to-use datasets compared to prior work, comprising multiple simulation runs across a larger number of initial and boundary conditions and PDE parameters; (3) more extensible source code with user-friendly APIs for data generation, together with baseline results for popular machine learning models (FNO, U-Net, PINN, Gradient-Based Inverse Method). PDEBench allows researchers to extend the benchmark freely for their own purposes using a standardized API and to compare the performance of new models to existing baseline methods. We also propose new evaluation metrics with the aim of providing a more holistic understanding of learning methods in the context of Scientific ML. Using these metrics, we identify tasks that are challenging for recent ML methods and propose these tasks as future challenges for the community. The code is available at https://github.com/pdebench/PDEBench.