Interpretability and causal discovery of the machine learning models to predict the production of CBM wells after hydraulic fracturing

Machine learning approaches are widely studied for predicting the production of coalbed methane (CBM) wells after hydraulic fracturing, but they are rarely used in practice because of their low generalization ability and lack of interpretability. This article proposes a novel methodology that discovers latent causality from observed data, with the aim of finding an indirect way to interpret machine learning results. Based on the theory of causal discovery, a causal graph is derived with explicit input, output, treatment, and confounding variables. SHAP (SHapley Additive exPlanations) is then employed to analyze the influence of each factor on production capability, which indirectly interprets the machine learning models. The proposed method can capture the underlying nonlinear relationships between the factors and the output, which remedies a limitation of traditional machine learning routines that rely on correlation analysis of the factors. An experiment on CBM data shows that the relationships detected by the presented method between production and the geological and engineering factors are consistent with the actual physical mechanism. In addition, compared with traditional methods, the interpretable machine learning models perform better in forecasting production capability, with an average improvement of 20% in accuracy.
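The attribution step that SHAP performs can be illustrated with a minimal sketch. The code below is not the authors' pipeline and does not use the `shap` library; it computes exact Shapley values by enumerating feature subsets for a toy model. The function `toy_production`, the feature names (permeability, thickness, fluid volume), and all numeric values are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` "switched on".
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_production(z):
    # Hypothetical nonlinear response: an interaction term plus a linear term.
    permeability, thickness, fluid_volume = z
    return 2.0 * permeability * thickness + 0.5 * fluid_volume


x = [3.0, 4.0, 10.0]        # hypothetical well to explain
baseline = [1.0, 2.0, 6.0]  # hypothetical reference well

phi = shapley_values(toy_production, x, baseline)
for name, contribution in zip(["permeability", "thickness", "fluid_volume"], phi):
    print(f"{name}: {contribution:.2f}")
```

The contributions sum to the difference between the model output at `x` and at `baseline`, and the interaction between the first two features is split between them. This is the property that lets Shapley-based attributions reflect nonlinear effects that a pairwise correlation analysis would miss.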

