Factorized Databases (FDBs) and the recently introduced Path Multiset Representations (PMRs) both aim to represent the results of database queries compactly, yet they appear quite different at first sight. FDBs were developed for the relational database model and represent finite sets of tuples, all of which have the same length. PMRs, on the other hand, were developed for the graph database model and represent possibly infinite multisets of variable-length paths. In this paper, we connect both representations through a common framework rooted in formal language theory. In particular, we show why FDBs are a special case of context-free grammars, which allows us to generalize FDBs beyond the standard setting of database relations. Since PMRs and finite automata are closely connected, this opens up a wide range of questions about trade-offs between the respective sizes of automata-based and grammar-based representations and the efficiency of query-plan operations on them. As a first step, we present initial results on size trade-offs between fundamental variants of automata-based and grammar-based compact representations.
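To make the grammar view concrete, the following is a minimal sketch and not the paper's construction. It assumes a hypothetical toy encoding in which a factorized relation is written as an acyclic context-free grammar: alternatives of a nonterminal play the role of union, and concatenation within an alternative plays the role of product. The names `GRAMMAR` and `expand`, and the values `a1`, `a2`, `b1`, `b2`, are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy grammar (an assumption for illustration).
# Each nonterminal maps to a list of alternatives (union);
# each alternative is a sequence of symbols (product).
# Symbols that are not keys of the dict are terminals, i.e. data values.
GRAMMAR = {
    "S": [["A", "B"]],       # S -> A B
    "A": [["a1"], ["a2"]],   # A -> a1 | a2
    "B": [["b1"], ["b2"]],   # B -> b1 | b2
}


def expand(symbol, grammar):
    """Return the set of tuples derivable from `symbol`.

    Assumes the grammar is acyclic, so the language is finite.
    """
    if symbol not in grammar:
        # A terminal stands for a one-value tuple.
        return {(symbol,)}
    result = set()
    for alternative in grammar[symbol]:
        parts = [expand(s, grammar) for s in alternative]
        # Concatenate one tuple from each part, in order.
        for combo in product(*parts):
            result.add(tuple(value for part in combo for value in part))
    return result


relation = expand("S", GRAMMAR)
```

In this sketch the grammar mentions each of the four values once, while the expanded relation lists all four tuples of the product explicitly; the gap widens as the number of alternatives and factors grows.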