Cloud computing has emerged as a key technology for providing individuals and companies with access to remote computing and storage infrastructures. To achieve highly available yet high-performing services, cloud data stores rely on data replication. Replication, however, raises the issue of consistency. Because data are replicated across multiple geographically distributed data centers, and to meet the growing requirements of distributed applications, many cloud data stores adopt eventual consistency, which allows data-intensive operations to run with low latency. This comes at the cost of data staleness. In this paper, we prioritize data replication based on a set of flexible data semantics that can best suit all types of Big Data applications, avoiding overloading both the network and the systems during long periods of disconnection or network partitions. To this end, we integrated these data semantics into the core architecture of a well-known NoSQL data store (HBase), which leverages a three-dimensional vector-field model (covering timeliness, number of pending updates, and divergence bounds) to provision data to applications selectively and on demand. This enhances the former consistency model by providing the different levels of consistency required by different applications, such as social networks or e-commerce sites, where the priority of updates also differs. In addition, our implementation of the model in HBase allows updates to be tagged and grouped atomically in logical batches, akin to transactions, ensuring atomic changes and correctness of updates as they are propagated.
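
To make the three-dimensional bound concrete, the following is a minimal sketch in Python, not the HBase implementation described in this paper. It assumes that pending updates are held back until any one of the three dimensions (age of the oldest update, number of pending updates, accumulated divergence) reaches its limit, and that they are then shipped together as one logical batch. The names `Bound`, `PendingQueue`, `must_ship`, and `drain` are hypothetical, as is the choice to measure divergence as the sum of absolute deltas.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Bound:
    """Hypothetical three-dimensional consistency bound (an assumption)."""
    max_age: float         # timeliness: seconds the oldest update may wait
    max_pending: int       # number of pending updates tolerated
    max_divergence: float  # accumulated value drift tolerated


@dataclass
class PendingQueue:
    """Holds updates back until any one dimension of the bound is reached."""
    bound: Bound
    updates: List[Tuple[str, float]] = field(default_factory=list)
    oldest: Optional[float] = None  # arrival time of the oldest pending update
    divergence: float = 0.0         # sum of absolute deltas (illustrative metric)

    def add(self, key: str, delta: float, now: float) -> None:
        """Record one update; `now` is a caller-supplied timestamp in seconds."""
        if self.oldest is None:
            self.oldest = now
        self.updates.append((key, delta))
        self.divergence += abs(delta)

    def must_ship(self, now: float) -> bool:
        """True when at least one of the three limits has been reached."""
        if not self.updates:
            return False
        return (
            now - self.oldest >= self.bound.max_age
            or len(self.updates) >= self.bound.max_pending
            or self.divergence >= self.bound.max_divergence
        )

    def drain(self) -> List[Tuple[str, float]]:
        """Hand over all pending updates as one batch and reset the counters."""
        batch, self.updates = self.updates, []
        self.oldest, self.divergence = None, 0.0
        return batch
```

Under these assumptions, an application that tolerates stale data (for example, a social-network counter) would be given loose limits, while one that does not (for example, an e-commerce stock level) would be given tight limits, so that its updates are propagated first.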