The performance of database systems is usually characterised by their average-case (i.e., throughput) behaviour in standardised or de-facto standard benchmarks such as TPC-X or YCSB. The tails of the latency (i.e., response time) distribution receive considerably less attention, yet they have been identified as a threat to overall system performance: in large-scale systems, even a small fraction of delayed requests can build up into delays perceivable by end users. To eradicate large tail latencies from database systems, the ability to record them faithfully, and to trace them to their root causes, is urgently required. In this paper, we address the challenge of measuring tail latencies using standard benchmarks, and identify subtle perils and pitfalls. In particular, we demonstrate how Java-based benchmarking approaches can substantially distort tail latency observations, and discuss how the discovery of such problems is inhibited by the common focus on throughput performance. Based on these observations, we make a case for purposefully re-designing database benchmarking harnesses to arrive at faithful characterisations of database performance from multiple important angles.