Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher

Comprehensive evaluation is one of the foundations of experimental science. In high-performance graph processing, a thorough evaluation of contributions becomes more achievable when different frameworks support common input formats. However, each framework defines its own format, which may not support reading large-scale real-world graph datasets. This creates a demand for high-performance graph-loading libraries that (i) accelerate the design of new graph algorithms, (ii) allow contributions to be evaluated on a wide range of graph algorithms, and (iii) facilitate easy and fast comparison across different graph frameworks. To that end, we present ParaGrapher, a high-performance API and library for loading large-scale compressed graphs. ParaGrapher supports different types of requests for accessing graphs in shared-memory, distributed-memory, and out-of-core graph processing. We explain the design of ParaGrapher and present a performance model of graph decompression, which we use to evaluate ParaGrapher on three storage types. Our evaluation shows that, by decompressing compressed graphs in WebGraph format, ParaGrapher delivers a speedup of up to 3.2 times in loading and up to 5.2 times in end-to-end execution in comparison to the binary and textual formats. ParaGrapher is available online at https://blogs.qub.ac.uk/DIPSA/ParaGrapher/.
