Graph similarity search algorithms usually leverage the structural properties of a database. Hence, these algorithms are effective only on some structural variations of the data and are ineffective on other forms, which makes them hard to use. Ideally, one would like to design a data analytics algorithm that is structurally robust, i.e., it returns essentially the same accurate results over all possible structural variations of a dataset. We propose a novel approach to create a structurally robust similarity search algorithm over graph databases. We leverage the classic insight in the database literature that schematic variations are caused by having constraints in the database. We then present RelSim algorithm which is provably structurally robust under these variations. Our empirical studies show that our proposed algorithms are structurally robust while being efficient and as effective as or more effective than the state-of-the-art similarity search algorithms.
翻译:图形相似搜索算法通常会利用数据库的结构特性。 因此, 这些算法只有在数据的某些结构变异上有效,而且在其它形式上也无效,因此很难使用。 理想的情况是,人们希望设计一种结构健全的数据分析算法,即它返回与数据集所有可能的结构变异基本上相同的准确结果。 我们提出一种新的方法,在图形数据库上建立一个结构强大的类似搜索算法。 我们利用数据库文献中的经典洞察力,即由于数据库中存在限制而导致的结构变异。 然后我们提出在这些变异下结构上具有可辨别性强力的RelSim算法。 我们的经验研究表明,我们提议的算法在结构上是稳健的,同时与最先进的类似搜索算法一样高效和有效或更有效。