Data-oriented applications, usually written in a high-level, general-purpose programming language (such as Java), interact with databases through a coarse interface. Informally, the text of a query is built on the application side (either via plain string concatenation or through an abstract notion of statement) and shipped over the wire to the database, where it is executed. The results are then serialized and sent back to the "client code", where they are translated into the language's native datatypes. This round trip is detrimental to performance but, worse, such a programming model prevents richer queries, namely queries containing user-defined functions (that is, functions defined by the programmer and used, e.g., in the filter condition of a SQL query). While some databases also possess a "server-side" language (e.g., PL/SQL in Oracle Database), its integration with the highly optimized query execution engine is still minimal, and queries containing (PL/SQL) user-defined functions remain notoriously inefficient. In this setting, we reviewed existing language-integrated query frameworks, highlighting that existing database query languages (including SQL) share high-level querying primitives (e.g., filtering, joins, aggregation) that can be represented by operators, but differ widely in the semantics of their expression languages. To represent queries in a manner agnostic to both the application language and the database, we designed a small calculus, dubbed "QIR" for Query Intermediate Representation. QIR contains expressions, corresponding to a small extension of the pure lambda-calculus, and operators that represent the usual querying primitives. To send efficient queries to the database, we captured the idea of "good" query representations as a measure on QIR terms. We then designed an evaluation strategy that rewrites QIR query representations into "better" ones.
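To make the shape of such a calculus concrete, the following is a minimal sketch in Python, not the actual QIR definition. All names (`Var`, `Lam`, `App`, `Scan`, `Filter`, `measure`, `normalize`) are hypothetical, the measure (a count of remaining applications) is an assumption chosen only for illustration, and the rewriting shown is plain beta-reduction with a naive substitution that does not avoid variable capture.

```python
from dataclasses import dataclass
from typing import Union

# Expressions: a tiny lambda-calculus fragment (hypothetical names).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"

# Operators: two querying primitives, enough for the illustration.
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Filter:
    pred: "Term"
    source: "Term"

Term = Union[Var, Lam, App, Scan, Filter]

def substitute(term, name, value):
    """Replace free occurrences of `name` by `value` (naive: no capture avoidance)."""
    if isinstance(term, Var):
        return value if term.name == name else term
    if isinstance(term, Lam):
        if term.param == name:  # `name` is shadowed inside this lambda
            return term
        return Lam(term.param, substitute(term.body, name, value))
    if isinstance(term, App):
        return App(substitute(term.fn, name, value),
                   substitute(term.arg, name, value))
    if isinstance(term, Filter):
        return Filter(substitute(term.pred, name, value),
                      substitute(term.source, name, value))
    return term  # Scan has no sub-terms

def measure(term):
    """Assumed toy measure: number of application nodes left (lower is 'better')."""
    if isinstance(term, App):
        return 1 + measure(term.fn) + measure(term.arg)
    if isinstance(term, Lam):
        return measure(term.body)
    if isinstance(term, Filter):
        return measure(term.pred) + measure(term.source)
    return 0

def normalize(term):
    """Beta-reduce everywhere; may loop forever on terms without a normal form."""
    if isinstance(term, App):
        fn, arg = normalize(term.fn), normalize(term.arg)
        if isinstance(fn, Lam):
            return normalize(substitute(fn.body, fn.param, arg))
        return App(fn, arg)
    if isinstance(term, Lam):
        return Lam(term.param, normalize(term.body))
    if isinstance(term, Filter):
        return Filter(normalize(term.pred), normalize(term.source))
    return term

# (fun t -> Filter(p, t)) applied to Scan("users")
query = App(Lam("t", Filter(Var("p"), Var("t"))), Scan("users"))
better = normalize(query)
```

In this sketch, `query` mixes an application-side abstraction with operators, while `better` is an operator-only term (`Filter` over `Scan`) with a strictly smaller measure, which is the kind of term one would expect to translate directly to a database query.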