In the Shortest Superstring problem, we are given a set of strings and ask for a common superstring with the minimum number of characters. The problem is NP-hard, and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. Being simpler than other well-performing approximation algorithms for this problem, GREEDY has attracted attention since the 1980s and is commonly used in practical applications. Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and Kaplan and Shafrir (IPL 2005) improved this upper bound to 3.5. We show that the approximation guarantee of GREEDY is at most $(13+\sqrt{57})/6 \approx 3.425$, making the first progress on this question since 2005. Furthermore, we prove that the Shortest Superstring can be approximated within a factor of $(37+\sqrt{57})/18 \approx 2.475$, improving slightly upon the currently best $2\frac{11}{23}$-approximation algorithm by Mucha (SODA 2013).
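The GREEDY procedure described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `overlap` and `greedy_superstring` are hypothetical, ties between equally large overlaps are broken by scan order (an assumption, since the abstract does not fix a tie-breaking rule), and duplicates and strings contained in other strings are discarded first, as is commonly assumed for this problem. The quadratic pair scan is chosen for readability, not efficiency.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Illustrative GREEDY: repeatedly merge a pair of maximum overlap."""
    # Assumption: drop duplicates and strings contained in another string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]

    while len(items) > 1:
        # Find an ordered pair (i, j), i != j, with maximum overlap.
        # Ties are broken by scan order (an assumption of this sketch).
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j

        # Merge the pair, writing the overlapping part only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)

    return items[0] if items else ""


print(greedy_superstring(["abc", "bcd", "cde"]))  # prints "abcde"
```

On this small input GREEDY happens to find an optimal superstring. The approximation bounds discussed above concern how far its output can be from optimal in the worst case.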