Determining the characteristics of newly drilled wells (e.g. reservoir formation properties) is a major challenge. One of the corresponding tasks is well-interval similarity assessment: if we can learn to predict which oilfields are rich and which are not by comparing them with existing ones, this will lead to significant cost reductions. There are three main requirements for applying machine learning to oil & gas data: high quality even for unreliable data, low manual effort, and interpretability of the model itself. Neural networks can be used to address these challenges. The use of a self-supervised paradigm leads to automatic model construction. However, existing approaches lack interpretability, and their quality prevents their use in applications. In particular, existing approaches like LSTM suffer from short-term memory, paying more attention to the end of a sequence. Instead, neural networks with the Transformer architecture cast their attention over the whole sequence to make a decision. To make them more efficient in terms of computational time and more robust to noisy or absent values, we introduce a limited attention mechanism similar to that of the Informer architecture that considers only the top correspondences. We run experiments on an open dataset with more than $20$ wells, making the results reliable and suitable for industrial use. The best results were obtained with our adaptation of the Informer variant of Transformer, with ROC AUC $0.982$. It outperforms classical approaches with ROC AUC $0.824$, recurrent neural networks (RNNs) with ROC AUC $0.934$, and the direct use of Transformer with ROC AUC $0.961$. We show that well-interval representations obtained by Informer are of higher quality than those extracted by RNNs. Moreover, the obtained attention is interpretable, as it corresponds to the importance of a particular part of an interval for the similarity estimation.
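To make the "limited attention" idea concrete, the following is a minimal sketch of a top-$k$ attention layer in the spirit of Informer's sparse attention, where each query attends only to its $k$ highest-scoring keys rather than the full sequence. All names here (`top_k_attention`, `k`, the toy shapes) are illustrative assumptions for exposition, not the paper's actual implementation.

```python
# A hedged sketch of top-k ("limited") attention, loosely inspired by
# Informer-style sparse attention. Each query keeps only its k largest
# dot-product scores; the rest are masked out before the softmax.
import torch
import torch.nn.functional as F

def top_k_attention(q, k_mat, v, k=8):
    """q, k_mat, v: (batch, seq_len, dim). Attend to at most k keys per query."""
    d = q.size(-1)
    scores = q @ k_mat.transpose(-2, -1) / d ** 0.5          # (batch, L_q, L_k)
    # Threshold = k-th largest score for each query row.
    thresh = scores.topk(k=min(k, scores.size(-1)), dim=-1).values[..., -1:]
    sparse = scores.masked_fill(scores < thresh, float("-inf"))
    attn = F.softmax(sparse, dim=-1)                          # mass on top-k keys only
    return attn @ v, attn

# Toy usage: one well interval of 16 depth steps embedded to dimension 32.
x = torch.randn(1, 16, 32)
out, attn = top_k_attention(x, x, x, k=4)
print(out.shape, (attn > 0).sum(-1).float().mean())  # roughly 4 active keys per query
```

Because the surviving attention weights are concentrated on a few positions, they can be read directly as the importance of parts of the interval, which is the interpretability property claimed above.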