Efficient error-controlled lossy compressors are becoming critical to the success of today's large-scale scientific applications because of the ever-increasing volume of data these applications produce. In the past decade, many lossless and lossy compressors have been developed, with distinct design principles, for scientific datasets from widely differing domains. To help researchers and users assess and compare compressors fairly and conveniently, we establish a standard compression assessment benchmark, the Scientific Data Reduction Benchmark (SDRBench). SDRBench contains a wide variety of real-world scientific datasets across different domains, summarizes several critical compression quality evaluation metrics, and integrates many state-of-the-art lossy and lossless compressors. We demonstrate evaluation results obtained with SDRBench and summarize six takeaways that support an in-depth understanding of lossy compressors.