Distributed databases replicate data and store it redundantly across multiple servers for availability. We focus on databases with frequent reads and writes, where both read and write latencies are important, in contrast to databases designed primarily for either read-heavy or write-heavy applications. Redundancy has opposing effects on read and write latency: read latency can be reduced by accessing multiple servers in parallel, whereas write latency increases because a larger number of replicas must be updated. We quantify this tradeoff between read and write latency as a function of redundancy, and provide a closed-form approximation when request arrivals are Poisson and service times are memoryless. We empirically show that this approximation is tight across all ranges of system parameters. Thus, we provide guidelines for redundancy selection in distributed databases.
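To make the direction of the tradeoff concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the abstract: a read completes when the fastest of n replicas responds, a write completes when all n replicas have been updated, service times are i.i.d. exponential with rate `mu`, and queueing is ignored entirely. It is not the closed-form approximation referred to above, which accounts for Poisson arrivals. The function names are hypothetical.

```python
from fractions import Fraction


def read_latency_no_queueing(n, mu=1):
    """Mean of the minimum of n i.i.d. Exp(mu) service times: 1 / (n * mu).

    Assumption: a read is served by whichever of the n replicas answers first.
    """
    return Fraction(1, n) / mu


def write_latency_no_queueing(n, mu=1):
    """Mean of the maximum of n i.i.d. Exp(mu) service times: H_n / mu.

    Assumption: a write finishes only when all n replicas are updated.
    H_n is the n-th harmonic number 1 + 1/2 + ... + 1/n.
    """
    return sum(Fraction(1, k) for k in range(1, n + 1)) / mu


if __name__ == "__main__":
    # Print both idealized latencies for a few redundancy levels.
    for n in (1, 2, 3, 4):
        print(n, read_latency_no_queueing(n), write_latency_no_queueing(n))
```

Under these assumptions the mean read latency shrinks as 1/n while the mean write latency grows like the harmonic number H_n, which is the opposing behavior described above. With queueing and a given read/write mix, the best redundancy level depends on the load.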