NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, their many variations, and their lack of formal semantics inhibit deep understanding of these query languages and impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON-native stores and SQL databases. The SQL++ semi-structured data model is a superset of both JSON and the SQL data model. SQL++ offers powerful computational capabilities for processing semi-structured data, akin to prior non-relational query languages, notably OQL and XQuery. Yet SQL++ is backwards compatible with SQL and is generalized towards JSON by introducing only a small number of query language extensions to SQL. Recognizing that a query language standard is probably premature for the fast-evolving area of NoSQL databases, SQL++ includes configuration options that formally itemize the semantic variations that language designers may choose from. The options often pertain to the treatment of semi-structuredness (missing attributes, heterogeneous types, etc.), where more than one sensible approach is possible. SQL++ is unifying: by appropriate choices of configuration options, the SQL++ semantics can morph into the semantics of existing semi-structured database query languages. The extensive experimental validation shows how SQL and four semi-structured database query languages (MongoDB, Cassandra CQL, Couchbase N1QL and AsterixDB AQL) are formally described by appropriate settings of the configuration options. Early signs of SQL++ adoption are positive: Version 4 of Couchbase's N1QL is explained as syntactic sugar over SQL++, AsterixDB will soon support the full SQL++, and Apache Drill is in the process of aligning with SQL++.