We introduce PyParSVD\footnote{https://github.com/Romit-Maulik/PyParSVD}, a Python library that implements a streaming, distributed, and randomized algorithm for the singular value decomposition. To demonstrate its effectiveness, we extract coherent structures from scientific data. Furthermore, we report weak-scaling assessments on up to 256 nodes of the Theta machine at the Argonne Leadership Computing Facility, demonstrating the library's potential for large-scale analyses of practical data sets.
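To make the terms "randomized" and "streaming" concrete, the following is a minimal single-process NumPy sketch of the two ideas. It is not PyParSVD's API: the function names `randomized_svd` and `streaming_update`, the `oversample` parameter, and the forgetting factor `ff` are assumptions chosen for illustration, and the distributed (multi-rank) part is omitted entirely.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD via a random range finder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    ell = min(n, k + oversample)            # sketch width, capped at n
    Omega = rng.standard_normal((n, ell))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)          # orthonormal basis for the sampled range
    B = Q.T @ A                             # small (ell x n) projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]


def streaming_update(U, s, A_new, k, ff=1.0):
    """Fold a new batch of columns into an existing truncated left basis.

    U, s  : current left singular vectors (m x k) and singular values (k,)
    A_new : new batch of snapshots (m x b)
    ff    : hypothetical forgetting factor; 1.0 weights old and new data equally
    """
    C = np.hstack([ff * U * s, A_new])      # compressed history next to new data
    U_new, s_new, _ = np.linalg.svd(C, full_matrices=False)
    return U_new[:, :k], s_new[:k]
```

In this sketch a batch of snapshots would first be factored (exactly or with `randomized_svd`), and each later batch would be absorbed with `streaming_update`, so the full data matrix never has to be held in memory at once. The update is exact only when the discarded singular values are zero; otherwise it is an approximation.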