Data management applications store their data in structured files, in which records are usually kept sorted to support indexing and queries. Inserting or removing a record in a sorted file requires shifting the positions of existing data: all data after the insertion or removal point must be rewritten to accommodate the change in place, which can be unaffordable for applications that make frequent updates. As a result, applications often employ extra layers of indirection to admit changes out of place. However, this indirection increases access costs and adds considerable complexity. This paper presents a novel file abstraction, FlexFile, that provides a flexible file address space in which in-place updates of arbitrary-sized data, such as insertions and removals, can be performed efficiently. With FlexFile, applications can manage their data in a linear file address space with minimal complexity. Extensive evaluation results show that a simple key-value store built on top of this abstraction can achieve high performance for both reads and writes.
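To make the cost of in-place updates concrete, the following minimal sketch inserts bytes into the middle of an ordinary flat file. It illustrates the problem described above and is not FlexFile's interface; the names `insert_in_place` and `demo` and the comma-separated record layout are assumptions made only for this example.

```python
import os
import tempfile
from typing import Tuple


def insert_in_place(path: str, offset: int, payload: bytes) -> int:
    """Insert payload at offset in a flat file.

    Returns the number of existing bytes that had to be rewritten.
    Hypothetical helper for illustration only.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        tail = f.read()      # all data after the insertion point
        f.seek(offset)
        f.write(payload)     # write the new record
        f.write(tail)        # rewrite the tail at its shifted position
    return len(tail)


def demo() -> Tuple[bytes, int]:
    # Build a small sorted "file" of comma-terminated records (assumed layout).
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"apple,cherry,")
        # Insert "banana," between the two existing records (offset 6).
        moved = insert_in_place(path, 6, b"banana,")
        with open(path, "rb") as f:
            return f.read(), moved
    finally:
        os.remove(path)


if __name__ == "__main__":
    content, moved = demo()
    print(content, moved)
```

In this sketch the number of rewritten bytes equals the size of everything after the insertion point, so an insertion near the start of a large file rewrites almost the whole file. That is the overhead the abstract says a flexible file address space is meant to avoid.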