Columnar databases are an established way to speed up online analytical processing (OLAP) queries. Today, data processing (e.g., storage, visualization, and analytics) is often performed at the programming language level, so it is desirable to adopt columnar data structures in common language runtimes as well. While frameworks, libraries, and APIs exist that enable columnar data stores in programming languages, integrating them into applications typically requires manual developer intervention. In prior work, researchers implemented an approach for *automated* transformation of arrays into columnar arrays in the GraalVM JavaScript runtime. However, this approach suffers from performance issues on smaller workloads as well as on more complex nested data structures. We find that the key to optimizing accesses to columnar arrays is to identify queries and apply specific optimizations to them. In this paper, we describe novel optimizations in the GraalVM Compiler that target queries on columnar arrays. At JIT compile time, we identify loops that access potentially columnar arrays and duplicate them, so that one copy can be optimized specifically for accesses to columnar arrays. Additionally, we describe a new approach for creating columnar arrays from arrays consisting of complex objects by performing **multi-level storage transformation**. We demonstrate our approach via an implementation for JavaScript `Date` objects. [ full abstract at https://doi.org/10.22152/programming-journal.org/2023/7/9 ]
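
The two ideas in the abstract, loop duplication and multi-level storage transformation, can be illustrated with a small user-level sketch in JavaScript. This is only an emulation under stated assumptions: in the paper the transformation happens inside the runtime and compiler and is transparent to the application, whereas here the layout is built by hand. The names `toColumnar`, `isColumnar`, `columns`, and `sumPriceSince` are hypothetical and not part of GraalVM or any library.

```javascript
// Row-wise layout: one object per record, as an application would write it.
const rows = [
  { id: 1, price: 10, created: new Date(Date.UTC(2023, 0, 1)) },
  { id: 2, price: 25, created: new Date(Date.UTC(2023, 5, 15)) },
  { id: 3, price: 40, created: new Date(Date.UTC(2024, 2, 3)) },
];

// Hypothetical columnar layout: one typed array per property.
// "Multi-level" here means the nested Date objects are also flattened:
// each Date is stored as its epoch-millisecond value in a numeric column.
function toColumnar(rowArray) {
  const n = rowArray.length;
  const columns = {
    id: new Float64Array(n),
    price: new Float64Array(n),
    created: new Float64Array(n),
  };
  rowArray.forEach((r, i) => {
    columns.id[i] = r.id;
    columns.price[i] = r.price;
    columns.created[i] = r.created.getTime();
  });
  return { isColumnar: true, length: n, columns };
}

// A query whose loop exists in two versions, selected by a layout check.
// This mimics by hand what loop duplication would do at JIT compile time.
function sumPriceSince(data, since) {
  const cutoff = since.getTime();
  let sum = 0;
  if (data.isColumnar) {
    // Columnar version: sequential scans over two numeric columns.
    const { price, created } = data.columns;
    for (let i = 0; i < data.length; i++) {
      if (created[i] >= cutoff) sum += price[i];
    }
  } else {
    // Generic version: one object and one Date dereference per element.
    for (let i = 0; i < data.length; i++) {
      const r = data[i];
      if (r.created.getTime() >= cutoff) sum += r.price;
    }
  }
  return sum;
}

const since = new Date(Date.UTC(2023, 3, 1));
const rowResult = sumPriceSince(rows, since);
const columnarResult = sumPriceSince(toColumnar(rows), since);
```

Both calls compute the same sum. The columnar version touches only the `price` and `created` columns and never dereferences a `Date` object, which is the kind of access pattern a query-specific optimization can exploit.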