As the scale of problems and data used for experimental design, signal processing, and data assimilation grows, the oft-occurring least squares subproblems grow correspondingly in size. Because the scale of these least squares problems creates prohibitive memory movement costs for the usual incremental QR and Krylov-based algorithms, randomized least squares solvers are garnering more attention. However, these randomized solvers are difficult to integrate into application algorithms, because their uncertainty limits practical tracking of algorithmic progress and reliable stopping. Accordingly, in this work, we develop theoretically rigorous, practical tools for quantifying the uncertainty of an important class of iterative randomized least squares algorithms, which we then use to track algorithmic progress and create a stopping condition. We demonstrate the effectiveness of our algorithm by solving a 0.78 TB least squares subproblem from the inner loop of incremental 4D-Var using only 195 MB of memory.
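To make the idea of tracking progress and stopping inside a row-action randomized solver concrete, the following is a minimal sketch and not the estimator or stopping rule developed in this work. It assumes a consistent system, uses randomized Kaczmarz with uniform row sampling as a stand-in for the class of algorithms, and tracks a moving average of squared single-row residuals as a noisy proxy for progress. The names `randomized_kaczmarz`, `window`, and `tol` are hypothetical.

```python
import numpy as np


def randomized_kaczmarz(A, b, window=50, tol=1e-12, max_iter=100_000, seed=0):
    """Illustrative randomized Kaczmarz with a moving-average stopping rule.

    Assumption: A x = b is consistent and no row of A is all zeros.
    The moving average below is a simple heuristic, not the paper's estimator.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norms_sq = np.einsum("ij,ij->i", A, A)  # squared norm of each row
    recent = np.zeros(window)  # circular buffer of squared row residuals

    for k in range(max_iter):
        i = rng.integers(m)        # sample one row uniformly at random
        r_i = A[i] @ x - b[i]      # residual of the sampled equation only
        recent[k % window] = r_i ** 2

        # Stop once the buffer is full and the average sampled residual is small.
        if k + 1 >= window and recent.mean() < tol:
            return x, k + 1, recent.mean()

        # Project x onto the hyperplane defined by row i.
        x -= (r_i / row_norms_sq[i]) * A[i]

    return x, max_iter, recent.mean()
```

Each iteration touches a single row of `A`, which is what makes this family of methods attractive when the full matrix does not fit in memory. The catch is that the per-row residual is a random quantity, so a single small value says little about overall progress. Averaging over a window reduces that noise, but deciding how much to trust such an average is exactly the uncertainty quantification question the abstract raises.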