Anyone in need of a data system today is confronted with numerous complex architectural options, such as traditional relational databases, NoSQL and NewSQL solutions, and sub-categories such as column-stores and row-stores. This overwhelming array of choices makes bootstrapping data-driven applications difficult and time-consuming, and it requires expertise that is often out of reach for cost reasons (e.g., for scientific labs or small businesses). In this paper, we present the vision of evolutionary data systems that free systems architects and application designers from the complex, cumbersome and expensive process of designing and tuning specialized data system architectures that fit only a single, static application scenario. Setting up an evolutionary system is as simple as identifying the data. As new data and queries come in, the system automatically evolves so that its architecture matches the properties of the incoming workload at all times. Inspired by the theory of evolution, an evolutionary system may, at any given point in time, employ multiple competing solutions at the low level of database architectures, each characterized as a combination of data layouts, access methods and execution strategies. Over time, "the fittest wins" and becomes the dominant architecture until the environment (workload) changes. In our initial prototype, we demonstrate solutions that can seamlessly evolve (back and forth) between a key-value store and a column-store architecture in order to adapt to changing workloads.
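To make the "fittest wins" idea concrete, the following is a minimal Python sketch, not the prototype described above. It assumes a toy fitness rule: when column scans make up more than half of a sliding window of recent operations, a column layout is considered fitter, otherwise a key-value layout is. The class name `EvolvingStore`, the `window` parameter, and the fitness rule are all hypothetical, and for simplicity the sketch keeps only one layout materialized at a time rather than running competing layouts side by side.

```python
from collections import deque


class EvolvingStore:
    """Toy store that evolves between a key-value layout and a column layout.

    Hypothetical illustration only: the fitness rule and all names are assumptions.
    Keys are assumed to be unique.
    """

    def __init__(self, columns, window=8):
        self.columns = list(columns)
        self.layout = "kv"                    # currently dominant layout
        self.rows = {}                        # kv layout: key -> tuple of values
        self.keys = []                        # column layout: keys by position
        self.cols = {c: [] for c in self.columns}  # column layout: name -> values
        self.recent = deque(maxlen=window)    # sliding window of operation kinds

    def insert(self, key, record):
        values = tuple(record[c] for c in self.columns)
        if self.layout == "kv":
            self.rows[key] = values
        else:
            self.keys.append(key)
            for c, v in zip(self.columns, values):
                self.cols[c].append(v)

    def get(self, key):
        """Point lookup: cheap in the kv layout, a linear search in the column layout."""
        self._observe("get")
        if self.layout == "kv":
            return dict(zip(self.columns, self.rows[key]))
        j = self.keys.index(key)
        return {c: self.cols[c][j] for c in self.columns}

    def scan(self, column):
        """Full column scan: cheap in the column layout, touches every row in kv."""
        self._observe("scan")
        if self.layout == "column":
            return list(self.cols[column])
        i = self.columns.index(column)
        return [row[i] for row in self.rows.values()]

    def _observe(self, kind):
        """Record the operation and switch layouts if the workload has shifted."""
        self.recent.append(kind)
        if len(self.recent) < self.recent.maxlen:
            return                            # not enough evidence yet
        scans = sum(1 for k in self.recent if k == "scan")
        fittest = "column" if scans > len(self.recent) / 2 else "kv"
        if fittest != self.layout:
            self._evolve(fittest)

    def _evolve(self, target):
        """Physically reorganize the data into the target layout."""
        if target == "column":
            self.keys = list(self.rows)
            self.cols = {
                c: [self.rows[k][i] for k in self.keys]
                for i, c in enumerate(self.columns)
            }
            self.rows = {}
        else:
            self.rows = {
                k: tuple(self.cols[c][j] for c in self.columns)
                for j, k in enumerate(self.keys)
            }
            self.keys = []
            self.cols = {c: [] for c in self.columns}
        self.layout = target
```

In this sketch, a run of `scan` calls eventually moves the data into the column layout, and a later run of `get` calls moves it back, mirroring the back-and-forth evolution between a key-value store and a column-store. A real system would weigh the cost of reorganization against the expected benefit instead of switching on a simple majority.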