In image denoising problems, the growing number of available images makes exhaustive visual inspection impossible, so automated methods based on machine learning must be deployed instead. This is particularly the case in seismic signal processing, where engineers and geophysicists have to deal with millions of seismic time series. Recovering the subsurface properties useful to the oil industry may take up to a year and is very costly in terms of computing and human resources. In particular, the data must go through several noise attenuation steps, and each denoising step is ideally followed by a quality control (QC) stage carried out by human experts. Learning a QC classifier in a supervised manner requires labeled training data, but collecting labels from human experts is extremely time-consuming. We therefore propose a novel active learning methodology that sequentially selects the most relevant data, which are then passed to a human expert for labeling. Beyond its application in geophysics, the technique promoted in this paper, based on estimates of the local error and its uncertainty, is generic. Its performance is supported by strong empirical evidence, as illustrated by the numerical experiments presented in this article, where it is compared with alternative active learning strategies on both synthetic and real seismic datasets.
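To make the sequential selection loop concrete, here is a minimal sketch of generic pool-based active learning. It is not the method proposed in the paper: the local error and uncertainty estimates are replaced by a simple stand-in, namely the vote disagreement of a bootstrap ensemble of nearest-centroid classifiers. The names `fit_centroids`, `predict`, `bootstrap_disagreement`, `active_learning` and `oracle` are all hypothetical, `oracle` stands in for the human expert, labels are assumed to be binary (0 or 1), and the initial labeled set is assumed to contain both classes.

```python
import numpy as np


def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean vector per class (0 and 1)."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])


def predict(centroids, X):
    """Assign each sample to the class of its closest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def bootstrap_disagreement(X_lab, y_lab, X_pool, rng, n_models=25):
    """Stand-in uncertainty score: p * (1 - p), with p the share of votes for class 1."""
    votes = np.zeros(len(X_pool))
    for _ in range(n_models):
        # Resample within each class so every bootstrap keeps both classes.
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y_lab == c), size=int((y_lab == c).sum()))
            for c in (0, 1)
        ])
        votes += predict(fit_centroids(X_lab[idx], y_lab[idx]), X_pool)
    p = votes / n_models
    return p * (1.0 - p)  # 0 when the ensemble is unanimous, 0.25 at an even split


def active_learning(X, oracle, init_idx, n_queries, seed=0):
    """Query the oracle (the human expert) on the most uncertain pool sample, one at a time."""
    rng = np.random.default_rng(seed)
    labeled = list(init_idx)
    labels = {i: oracle(i) for i in labeled}
    for _ in range(n_queries):
        pool = np.array([i for i in range(len(X)) if i not in labels])
        y_lab = np.array([labels[i] for i in labeled])
        score = bootstrap_disagreement(X[labeled], y_lab, X[pool], rng)
        pick = int(pool[score.argmax()])
        labels[pick] = oracle(pick)  # the only step that costs expert time
        labeled.append(pick)
    return labeled, np.array([labels[i] for i in labeled])
```

In a QC setting, `X` would hold features computed from the denoised data and `oracle(i)` would return the expert's accept/reject decision for sample `i`; in the sketch the selection criterion is confined to `bootstrap_disagreement`, so a different criterion can be substituted there without touching the loop.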