Finding patterns in large, highly connected datasets is critical for value discovery in business development and scientific research. This work focuses on the problem of subgraph matching on streaming graphs, which is useful in a myriad of real-world applications ranging from social network analysis to cybersecurity. Each application imposes a different set of control parameters, including the restrictions for a match, the type of data stream, and the search granularity. The problem-driven design of existing subgraph matching systems makes them challenging to apply to different problem domains. This paper presents Mnemonic, a programmable system that provides a high-level API and democratizes the development of a wide variety of subgraph matching solutions. Importantly, Mnemonic also delivers key data management capabilities and optimizations to support real-time processing on long-running, high-velocity multi-relational graph streams. The experiments demonstrate the versatility of Mnemonic and show that it outperforms several state-of-the-art systems by up to two orders of magnitude.