We study the impact of sub-array merging routines on merge-based sorting algorithms. More precisely, we focus on the galloping sub-routine that Timsort uses to merge monotonic (non-decreasing) sub-arrays, hereafter called runs, and on how using this sub-routine instead of a naïve merging routine affects the number of element comparisons performed. The efficiency of Timsort and of similar sorting algorithms has often been explained by using the notion of runs and the associated run-length entropy. Here, we focus on the related notion of dual runs, which was introduced in the 1990s, and the associated dual run-length entropy. We prove, for this complexity measure, results that are similar to those already known for standard run-induced measures: in particular, Timsort requires only O(n + n log(σ)) element comparisons to sort arrays of length n with σ distinct values. In order to do so, we introduce new notions of fast- and middle-growth for natural merge sorts (i.e., algorithms based on merging runs). By using these notions, we prove that several merge sorting algorithms, provided that they use Timsort's galloping sub-routine for merging runs, are as efficient as Timsort at sorting arrays with low run-induced or dual-run-induced complexities.
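To make the contrast between the two merging routines concrete, here is a minimal Python sketch that counts element comparisons for a naïve merge and for a simplified galloping merge. It is an illustration under stated assumptions, not Timsort's actual routine: the function names (`naive_merge`, `gallop`, `galloping_merge`) are hypothetical, and this version gallops on every step (exponential search followed by binary search), whereas Timsort only enters galloping mode after one run has won several consecutive comparisons.

```python
def naive_merge(a, b):
    """Merge two sorted lists one element at a time.

    Returns (merged_list, number_of_element_comparisons).
    """
    out, i, j, comps = [], 0, 0, 0
    while i < len(a) and j < len(b):
        comps += 1
        if b[j] < a[i]:
            out.append(b[j])
            j += 1
        else:  # ties go to the left run, which keeps the merge stable
            out.append(a[i])
            i += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comps


def gallop(run, lo, goes_first):
    """Find the first index >= lo at which goes_first(run[idx]) is False.

    Assumes goes_first is True on a (possibly empty) prefix of run[lo:]
    and False afterwards. Returns (index, number_of_comparisons).
    """
    n, comps = len(run), 0
    known_true = lo - 1   # every index <= known_true satisfies goes_first
    first_false = n       # index n acts as a sentinel "False"
    offset = 1
    # Exponential phase: probe indices lo, lo+1, lo+3, lo+7, ...
    while lo + offset - 1 < n:
        probe = lo + offset - 1
        comps += 1
        if goes_first(run[probe]):
            known_true = probe
            offset *= 2
        else:
            first_false = probe
            break
    # Binary-search phase between the last True and the first False probe.
    left, right = known_true + 1, first_false
    while left < right:
        mid = (left + right) // 2
        comps += 1
        if goes_first(run[mid]):
            left = mid + 1
        else:
            right = mid
    return left, comps


def galloping_merge(a, b):
    """Merge two sorted lists, copying whole blocks located by gallop().

    Simplified variant (assumption): it always gallops, unlike Timsort.
    Returns (merged_list, number_of_element_comparisons).
    """
    out, i, j, comps = [], 0, 0, 0
    while i < len(a) and j < len(b):
        # Take every element of a that is <= b[j] (stable on ties).
        hi, c = gallop(a, i, lambda x: x <= b[j])
        comps += c
        out.extend(a[i:hi])
        i = hi
        if i == len(a):
            break
        # Now a[i] > b[j]: take every element of b that is < a[i].
        hi, c = gallop(b, j, lambda x: x < a[i])
        comps += c
        out.extend(b[j:hi])
        j = hi
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comps


if __name__ == "__main__":
    left_run = list(range(1000))
    right_run = [1000, 1001]
    print("naive comparisons:    ", naive_merge(left_run, right_run)[1])
    print("galloping comparisons:", galloping_merge(left_run, right_run)[1])
```

When one run is consumed in long blocks, as in the usage example, the galloping variant needs a number of comparisons that is logarithmic in the block length, while the naïve merge needs one comparison per element copied. On inputs where the two runs interleave finely, this always-gallop sketch can perform more comparisons than the naïve merge, which is why Timsort switches galloping on and off adaptively.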