This paper proposes a new method that combines checkpointing with error-controlled lossy compression for large-scale, high-performance Full-Waveform Inversion (FWI), an inverse problem commonly used in geophysical exploration. This combination can significantly reduce data movement, which lowers both run time and peak memory. In the exascale computing era, data transfer (e.g., over the memory bus, over PCIe for GPUs, or over the network) is the performance bottleneck rather than the peak FLOPS of the processing unit. Like many other adjoint-based optimization problems, FWI is costly in terms of floating-point operations, memory footprint during backpropagation, and data transfer overheads. Past work on adjoint methods has developed checkpointing methods that reduce the peak memory requirements during backpropagation at the cost of additional floating-point computations. Combining this traditional checkpointing with error-controlled lossy compression, we explore the three-way tradeoff between memory, precision, and time to solution. We investigate how approximation errors introduced by lossy compression of the forward solution impact the objective function gradient and the final inverted solution. Empirical results from these numerical experiments indicate that high lossy-compression rates (compression factors of up to 100) have a relatively minor impact on convergence rates and the quality of the final solution.
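As a rough illustration of the idea, the following minimal sketch stores every k-th forward state in compressed form and recomputes the states in between when they are needed during the reverse pass. It is not the method or compressor used in the paper: the uniform quantizer followed by zlib is a simple stand-in for an error-controlled lossy compressor, the `step` function is a toy smoothing operator rather than a wave propagator, and all function names and parameters are hypothetical.

```python
import zlib
import numpy as np

def compress(field, abs_err):
    """Quantize onto a grid of spacing 2*abs_err, then deflate the integer codes.

    Rounding to the nearest grid point keeps the pointwise error within abs_err.
    """
    codes = np.round(field / (2.0 * abs_err)).astype(np.int32)
    return zlib.compress(codes.tobytes()), field.shape

def decompress(blob, shape, abs_err):
    """Invert compress() up to the quantization error."""
    codes = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return codes.astype(np.float64) * (2.0 * abs_err)

def step(u):
    """Toy forward operator (assumption): periodic three-point smoothing."""
    return 0.5 * u + 0.25 * (np.roll(u, 1) + np.roll(u, -1))

def forward_with_checkpoints(u0, n_steps, every, abs_err):
    """Run the forward pass, keeping a compressed copy of every `every`-th state."""
    checkpoints = {}
    u = u0
    for t in range(n_steps):
        if t % every == 0:
            checkpoints[t] = compress(u, abs_err)
        u = step(u)
    return u, checkpoints

def restore_state(checkpoints, t, every, abs_err):
    """Rebuild the state at step t from the nearest earlier checkpoint.

    The extra step() calls are the recomputation cost that checkpointing
    trades for lower peak memory.
    """
    base = (t // every) * every
    blob, shape = checkpoints[base]
    u = decompress(blob, shape, abs_err)
    for _ in range(t - base):
        u = step(u)
    return u
```

In this sketch, `every` controls the memory versus recomputation tradeoff and `abs_err` controls the memory versus precision tradeoff, which together mirror the three-way tradeoff described above.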