In-memory key-value stores provide consistent low-latency access to all objects, which is important for interactive large-scale applications such as social media networks or online graph analytics, and also opens up new application areas. However, when data is stored in RAM on thousands of servers, server failures must be taken into account. Only a few in-memory key-value stores provide automatic online recovery of failed servers. The most prominent example of these systems is RAMCloud. Another system with sophisticated fault-tolerance mechanisms is DXRAM, which is optimized for small data objects. In this report, we detail the log-based remote replication process, investigate selection strategies for the reorganization of these logs, and evaluate the reorganization performance for sequential, random, Zipf and hot-and-cold distributions in DXRAM. This is also the first time DXRAM's backup system is evaluated with high-speed I/O devices, specifically with a 56 GBit/s InfiniBand interconnect and PCI-e SSDs. Furthermore, we discuss the copyset replica distribution, which reduces the probability of data loss, and the adaptations made to the original approach for DXRAM.