Investment in measuring a process more completely or accurately is only worthwhile if these improvements can be utilised during modelling and inference. We consider how improvements to data quality over time can be incorporated when selecting a modelling threshold and in the subsequent inference of an extreme value analysis. Motivated by earthquake catalogues, we consider variable data quality in the form of rounded and incompletely observed data. We develop an approach to select a time-varying modelling threshold that makes best use of the available data, accounting for uncertainty in the magnitude model and for the rounding of observations. We demonstrate the benefits of the proposed approach on simulated data and apply the method to a catalogue of earthquakes induced by gas extraction in the Netherlands. This more than doubles the usable catalogue size and greatly increases the precision of high-magnitude quantile estimates, with important consequences for the design and cost of earthquake defences. For the first time, we find compelling data-driven evidence against the applicability of the Gutenberg-Richter law to these earthquakes. Furthermore, our approach to automated threshold selection appears to have much potential for generic applications of extreme value methods.