This paper studies reconstruction of strings based upon their substrings spectrum. Under this paradigm, it is assumed that all substrings of some fixed length are received and the goal is to reconstruct the string. While many existing works assumed that substrings are received error free, we follow in this paper the noisy setup of this problem that was first studied by Gabrys and Milenkovic. The goal of this study is twofold. First we study the setup in which not all substrings in the multispectrum are received, and then we focus on the case where the read substrings are not error free. In each case we provide specific code constructions of strings that their reconstruction is guaranteed even in the presence of failure in either model. We present efficient encoding and decoding maps and analyze the cardinality of the code constructions, while studying the cases where the rates of our codes approach 1.
翻译:本文研究基于子字符串频谱的字符串重建。 在这一范式下, 假设收到了所有固定长度的子字符串, 目标是重建字符串。 虽然许多现有工程假设子字符串是免费的, 但我们在本文中关注Gabrys和Milenkovic首次研究的这个问题的吵闹设置。 本研究的目标是双重的。 首先, 我们研究并非多频谱中所有子字符串都得到的设置, 然后我们关注读子字符串并非无误的情况。 在每一种情况下, 我们提供具体的代码结构, 即使在两种模式都失败的情况下, 也保证其重建。 我们展示高效的编码和解码地图, 分析代码构造的基本性, 同时研究我们代码率接近1的案例 。