Uncertainty quantification in artificial intelligence (AI)-based predictions of material properties is of immense importance for the success and reliability of AI applications in materials science. While confidence intervals are commonly reported for machine learning (ML) models, prediction intervals, i.e., estimates of the uncertainty on each individual prediction, are seldom available. In this work we compare three approaches to obtaining such individual uncertainties, testing them on 12 physical properties predicted with ML. Specifically, we investigate the quantile loss function, direct machine learning of the prediction intervals, and Gaussian processes. We identify each approach's advantages and disadvantages and slightly favor modeling the individual uncertainties directly, as it is the easiest to fit and, in most cases, minimizes over- and underestimation of the predicted errors. All data for training and testing were taken from the publicly available JARVIS-DFT database, and the codes developed for computing the prediction intervals are available through JARVIS-Tools.
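As an illustration of the first of these approaches, the quantile (pinball) loss can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the JARVIS-Tools implementation: the function names `pinball_loss` and `interval_coverage` are hypothetical, the data are synthetic rather than drawn from JARVIS-DFT, and the "model" is just a constant predictor chosen by grid search. It shows that minimizing the loss at levels 0.05 and 0.95 gives bounds that enclose roughly 90% of the targets.

```python
import numpy as np


def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss for a quantile level q in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by q, over-prediction by (1 - q).
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


def interval_coverage(y_true, lower, upper):
    """Fraction of targets that fall inside the interval [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Synthetic stand-in targets (an assumption; not taken from JARVIS-DFT).
rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=1.0, size=2000)

# For a constant predictor, the pinball loss at level q is minimized
# near the empirical q-quantile of the targets; find it by grid search.
candidates = np.linspace(-3.0, 3.0, 601)
lower = candidates[np.argmin([pinball_loss(y, c, 0.05) for c in candidates])]
upper = candidates[np.argmin([pinball_loss(y, c, 0.95) for c in candidates])]

coverage = interval_coverage(y, lower, upper)
print(f"interval: [{lower:.2f}, {upper:.2f}], coverage: {coverage:.3f}")
```

In practice the constant predictor would be replaced by a regression model trained once per quantile level. The coverage check stays the same, and it is one simple way to see whether the predicted intervals over- or underestimate the actual errors.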