In the Minimum Common String Partition Problem (MCSP), we are given two strings on input, and we want to partition both into the same collection of substrings, minimizing the number of the substrings in the partition. This combinatorial optimization problem has applications in computational biology and is NP-hard. Many different heuristic and exact methods exist for this problem, such as a Greedy approach, Ant Colony Optimization, or Integer Linear Programming. In this paper, we formulate the MCSP as a Dynamic Program and develop an exact solution algorithm based on Decision Diagrams for it. We also introduce a restricted Decision Diagram that allows to compute heuristic solutions to the MCSP and compare the quality of solution and runtime on instances from literature with existing approaches. Our approach scales well and is suitable for heuristic solution of large-scale instances.
翻译:在最小共同字符串分割问题( MSP ) 中,我们有两个输入字符串,我们想要将输入分为两个字符串,两个都分成一个子字符串,将分区中的子字符串数量减少到最小。这个组合优化问题在计算生物学中具有应用性,并且是硬的NP。对于这个问题,存在着许多不同的杂交和精确的方法,例如贪婪方法、Ant Colony 优化化或Integer线性编程。在本文中,我们将 MSP 设计成一个动态程序,并根据决策图图绘制一个精确的解算法。我们还引入了一个有限的决定图,以便能够对 MSP 进行超常解决方案的计算,比较解决方案的质量,并用现有方法对文献实例进行运行。我们的方法尺度很好,适合大规模实例的超常解算法。