As the scale of problems and data used for experimental design, signal processing and data assimilation grows, the oft-occurring least squares subproblems grow correspondingly in size. Because the scale of these least squares problems creates prohibitive memory movement costs for the usual incremental QR and Krylov-based algorithms, randomized least squares solvers are garnering more attention. However, these randomized least squares solvers are difficult to integrate into application algorithms, as their uncertainty limits practical tracking of algorithmic progress and reliable stopping. Accordingly, in this work, we develop theoretically rigorous, practical tools for quantifying the uncertainty of an important class of iterative randomized least squares algorithms, which we then use to track algorithmic progress and create a stopping condition. We demonstrate the effectiveness of our algorithm by solving a 0.78 TB least squares subproblem from the inner loop of incremental 4D-Var using only 195 MB of memory.