The representation learning problem in the oil & gas industry aims to construct a model that provides a representation of a well interval based on its logging data. Previous attempts are mainly supervised and focus on the similarity task, which estimates the closeness between intervals. We aim to build informative representations without using labelled data. One possible approach is self-supervised learning (SSL). In contrast to the supervised paradigm, it requires few or no labels. Most current SSL approaches are either contrastive or non-contrastive. Contrastive methods pull the representations of similar (positive) objects closer together while pushing apart those of dissimilar (negative) ones. Because positive and negative pairs can be marked incorrectly, these methods may deliver inferior performance. Non-contrastive methods do not rely on such labelling and are widespread in computer vision. They learn from pairs of similar objects only, which are easier to identify in logging data. We are the first to introduce non-contrastive SSL for well-logging data. In particular, we adopt Bootstrap Your Own Latent (BYOL) and Barlow Twins, two methods that avoid negative pairs and focus solely on matching positive pairs. A crucial component of these methods is the augmentation strategy. Our augmentation strategies, together with our adaptations of BYOL and Barlow Twins, achieve superior quality on clustering and mostly the best performance on different classification tasks. Our results demonstrate the usefulness of the proposed non-contrastive self-supervised approaches for representation learning in general and interval similarity in particular.
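To make the non-contrastive objective concrete, the following is a minimal sketch of the Barlow Twins loss mentioned above: it decorrelates the dimensions of two augmented views' embeddings while forcing matching dimensions to agree, using no negative pairs. This is an illustrative NumPy version under standard assumptions (batch-standardized embeddings, a trade-off weight `lambd`), not the paper's actual implementation.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """Barlow Twins objective for two (batch, dim) embedding matrices
    of augmented views of the same intervals (positive pairs only)."""
    # Standardize each embedding dimension over the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + 1e-9)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + 1e-9)
    n = z_a.shape[0]
    # Cross-correlation matrix between the two views (dim x dim).
    c = z_a.T @ z_b / n
    # Invariance term: diagonal entries should equal 1 (views agree).
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()
    # Redundancy-reduction term: off-diagonal entries should be 0.
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    return on_diag + lambd * off_diag
```

With identical views the invariance term vanishes, so the loss is near zero; for two independent embeddings the cross-correlation is near zero everywhere and the loss is dominated by the diagonal penalty.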