Graph database engines stand out in the era of big data for their efficiency in modeling and processing linked data, so there is a strong need to test them. However, random testing, the most practical approach to automated test generation, faces three challenges when used to detect bugs in graph database engines: semantic validity, non-empty results, and behavior diversity. To address these challenges, in this paper we propose GDsmith, the first black-box approach for testing graph database engines. GDsmith ensures that each randomly generated Cypher query satisfies the semantic requirements through skeleton generation and completion. It increases the probability of producing Cypher queries that return non-empty results by applying three types of structural mutation strategies. It also improves the behavior diversity of the generated Cypher queries by selecting property keys according to their previous frequencies when generating new queries. Our evaluation results demonstrate that GDsmith is effective and efficient for automated query generation and substantially outperforms the baseline. GDsmith successfully detects 27 previously unknown bugs in the released versions of three popular open-source graph database engines and has received positive feedback from their developers.
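As an illustration of frequency-guided property-key selection, the following is a minimal sketch and not GDsmith's actual implementation. It assumes that keys used less often in earlier queries should be favored, and it uses an inverse-frequency weight; both the weighting formula and the names `pick_property_key`, `usage`, and the sample keys are hypothetical choices made for this example.

```python
import random
from collections import Counter


def pick_property_key(keys, usage, rng):
    """Pick one property key, favoring keys that were used less often.

    Assumption for illustration: weight = 1 / (1 + previous use count).
    """
    weights = [1.0 / (1 + usage[k]) for k in keys]
    choice = rng.choices(keys, weights=weights, k=1)[0]
    usage[choice] += 1  # record the use so later picks see the new frequency
    return choice


rng = random.Random(0)          # fixed seed so runs are repeatable
usage = Counter()               # previous frequency of each property key
keys = ["name", "age", "id"]    # hypothetical property keys
picked = [pick_property_key(keys, usage, rng) for _ in range(300)]
```

Under this weighting, a key that has already appeared many times becomes progressively less likely to be chosen, which pushes later queries toward keys that have been exercised less.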