Data management is becoming increasingly important for handling the large volumes of data produced by today's large-scale scientific simulations and experimental instruments. Multi-grid compression algorithms offer a promising way to manage scientific data at scale, but they are not tailored for performance or reduction quality. In this paper, we optimize a multi-grid-based algorithm to achieve high-performance, high-quality error-controlled lossy compression. Our contributions are threefold. 1) We quantize the multi-grid coefficients level by level, and use multi-grid decomposition as a preconditioner rather than as a standalone compressor to improve compression ratios. 2) We optimize the performance of multi-grid decomposition and recomposition with a series of system-level and algorithm-level techniques. 3) We evaluate the proposed method on four real-world scientific datasets and compare it with several state-of-the-art lossy compressors. Experiments demonstrate that our optimizations improve the decomposition/recomposition performance of the existing multi-grid-based approach by up to 70X, and that the proposed compression method improves the compression ratio by up to 2X over the second-best error-controlled lossy compressor under the same distortion.
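To make the idea of level-wise quantization in contribution 1 concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes the multi-grid coefficients are already grouped by level and applies uniform scalar quantization with a separate tolerance per level. The function names, the sample coefficients, and the tolerance values are hypothetical and chosen only for illustration; how the paper derives per-level tolerances is not stated in the abstract.

```python
def quantize_level(coeffs, tol):
    """Map each coefficient to an integer bin of width 2*tol (uniform quantization)."""
    step = 2.0 * tol
    return [round(c / step) for c in coeffs]


def dequantize_level(codes, tol):
    """Reconstruct each coefficient as the center of its bin."""
    step = 2.0 * tol
    return [q * step for q in codes]


def quantize_levelwise(levels, tolerances):
    """Quantize every level with its own tolerance (one tolerance per level)."""
    return [quantize_level(c, t) for c, t in zip(levels, tolerances)]


def dequantize_levelwise(codes, tolerances):
    """Invert quantize_levelwise up to the per-level tolerance."""
    return [dequantize_level(q, t) for q, t in zip(codes, tolerances)]


# Hypothetical coefficients, coarsest level first (assumption for illustration).
levels = [
    [1.32, -0.87],
    [0.21, -0.05, 0.44, 0.09],
    [0.012, -0.031, 0.007, 0.020, -0.015, 0.003, 0.028, -0.009],
]
# Hypothetical per-level tolerances; the actual choice is not given in the abstract.
tolerances = [0.1, 0.05, 0.025]

codes = quantize_levelwise(levels, tolerances)
recon = dequantize_levelwise(codes, tolerances)

# Largest absolute coefficient error observed on each level.
max_errors = [
    max(abs(a - b) for a, b in zip(orig, rec))
    for orig, rec in zip(levels, recon)
]
```

Each entry of `max_errors` stays within the tolerance of its level, because rounding to the nearest bin moves a value by at most half a bin width. This bounds the error of individual coefficients only; translating such bounds into a guarantee on the recomposed data depends on the multi-grid transform itself, which this sketch does not model. The integer codes would then be passed to a lossless encoder, which is where small, repetitive values on the finer levels pay off.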