We propose a new data structure, Parallel Adjacency Lists (PAL), for efficiently managing graphs with billions of edges on disk. The PAL structure is based on the graph storage model of GraphChi (Kyrola et al., OSDI 2012), but we extend it to support online database features such as queries and fast insertions. In addition, we extend the model with edge and vertex attributes. Compared to previous data structures, PAL can store graphs more compactly while allowing fast access to both the incoming and the outgoing edges of a vertex, without duplicating data. Based on PAL, we design a graph database management system, GraphChi-DB, which can also execute powerful analytical graph computations. We evaluate our design experimentally and demonstrate that GraphChi-DB achieves state-of-the-art performance on graphs that are much larger than the available memory. GraphChi-DB enables anyone with just a laptop or a PC to work with extremely large graphs.
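To make the "both directions without duplicating data" claim concrete, the following is a minimal in-memory sketch, not the PAL implementation. It assumes a GraphChi-style layout in which each edge is stored exactly once, in the partition that owns its destination vertex, and each partition is kept sorted by source vertex. The class name `ToyEdgePartitions` and its methods are hypothetical and chosen for illustration only; on-disk layout, attributes, and insertions are omitted.

```python
from bisect import bisect_left, bisect_right


class ToyEdgePartitions:
    """Illustrative only: every edge is stored once, partitioned by destination."""

    def __init__(self, edges, num_vertices, num_partitions):
        # Each partition owns a contiguous interval of destination vertex IDs.
        self.interval_len = -(-num_vertices // num_partitions)  # ceiling division
        self.partitions = [[] for _ in range(num_partitions)]
        for src, dst in edges:
            self.partitions[dst // self.interval_len].append((src, dst))
        # Within a partition, edges are sorted by source vertex.
        for part in self.partitions:
            part.sort()
        # Source column per partition, used for binary search.
        self.sources = [[s for s, _ in part] for part in self.partitions]

    def out_edges(self, v):
        """Out-neighbors of v: one binary search per partition."""
        result = []
        for part, srcs in zip(self.partitions, self.sources):
            lo = bisect_left(srcs, v)
            hi = bisect_right(srcs, v)
            result.extend(dst for _, dst in part[lo:hi])
        return result

    def in_edges(self, v):
        """In-neighbors of v: scan only the partition that owns v."""
        part = self.partitions[v // self.interval_len]
        return [src for src, dst in part if dst == v]


# Small assumed example graph with 4 vertices split into 2 partitions.
store = ToyEdgePartitions(
    edges=[(0, 1), (0, 3), (2, 1), (3, 0), (1, 3)],
    num_vertices=4,
    num_partitions=2,
)
out_of_0 = store.out_edges(0)
into_1 = store.in_edges(1)
```

In this sketch an out-edge query touches every partition but reads only a small sorted range from each, while an in-edge query touches a single partition. A real system would index in-edges rather than scan the partition; the scan here only keeps the example short.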