The suffix array is arguably one of the most important data structures in sequence analysis and consequently there is a multitude of suffix sorting algorithms. However, to this date the \texttt{GSACA} algorithm introduced in 2015 is the only known non-recursive linear-time suffix array construction algorithm (SACA). Despite its interesting theoretical properties, there has been little effort in improving the algorithm's subpar real-world performance. There is a super-linear algorithm \texttt{DSH} which relies on the same sorting principle and is faster than \texttt{DivSufSort}, the fastest SACA for over a decade. This paper is concerned with analysing the sorting principle used in \texttt{GSACA} and \texttt{DSH} and exploiting its properties in order to give an optimised linear-time algorithm. Our resulting algorithm is not only significantly faster than \texttt{GSACA} but also outperforms\texttt{DivSufSort} and \texttt{DSH}.
翻译:后缀阵列可以说是序列分析中最重要的数据结构之一, 因此有多种后缀排序算法。 但是, 至今为止, 2015 年引入的\ texttt{ GSACA} 算法是唯一已知的非累进线性后缀阵列构建算法( SACA) 。 尽管它具有有趣的理论属性, 但它在改进算法的亚par真实世界性能方面没有做出多大努力。 一个超级线性算法 \ textt{ DSH} 依赖相同的排序原则, 并且比 10年来最快的 SACA 算法(\ textt{ DivSufSort} ) 更快。 本文涉及分析 \ textt{ GSACA} 和\ textt{H} 所使用的排序原则, 并利用其属性来提供优化线性线性算算法。 我们产生的算算算法不仅大大快于\ textt{ GSA} 而且还超越了\ texttratt{ DovSort} 和\\\\\ tDSDST} 。