Given its status as a classic problem and its importance to both theoreticians and practitioners, edit distance provides an excellent lens through which to understand how the theoretical analysis of algorithms impacts practical implementations. From an applied perspective, the goals of theoretical analysis are to predict the empirical performance of an algorithm and to serve as a yardstick for designing novel algorithms that perform well in practice. In this paper, we systematically survey the types of theoretical analysis that have been applied to edit distance and evaluate the extent to which each has achieved these two goals. These techniques include traditional worst-case analysis; worst-case analysis parametrized by edit distance, entropy, or compressibility; average-case analysis; semi-random models; and advice-based models. We find that the track record is mixed. On the one hand, two algorithms widely used in practice were born out of theoretical analysis, and their empirical performance is captured well by theoretical predictions. On the other hand, none of the algorithms developed since then with theoretical analysis as a yardstick has had practical relevance. We conclude by discussing the remaining open problems and how they might be tackled.
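For readers less familiar with the problem, the sketch below shows the textbook quadratic-time dynamic program for edit distance (the Wagner–Fischer recurrence), which is the baseline against which the worst-case and parametrized analyses discussed here are typically measured. The code is a minimal illustrative sketch, not an implementation drawn from this survey; the function and variable names are our own.

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic program: dp[i][j] is the edit distance between a[:i] and
    # b[:j]. Only the previous row is kept, so space is O(|b|) and time O(|a||b|).
    prev = list(range(len(b) + 1))            # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # deleting all i characters of a[:i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1       # substitution cost
            curr.append(min(prev[j] + 1,      # delete ca
                            curr[j - 1] + 1,  # insert cb
                            prev[j - 1] + cost))
        prev = curr
    return prev[len(b)]
```

For example, edit_distance("kitten", "sitting") returns 3 (two substitutions and one insertion).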