Motivated by the sequence reconstruction problem initiated by Levenshtein, reconstruction codes were introduced by Cai \emph{et al.} to combat errors when a fixed number of noisy channels are available. The central problem is to design codes that are as large as possible, such that every codeword can be uniquely reconstructed from any $N$ distinct noisy reads, where $N$ is fixed. In this paper, we study binary reconstruction codes under the constraint that every codeword is balanced, which is a common requirement in DNA-based storage. For all possible channels with a single edit error and their variants, we design asymptotically optimal balanced reconstruction codes for all $N$, and show that, as $N$ increases, the number of their redundant symbols decreases from $\frac{3}{2}\log_2 n+O(1)$ to $\frac{1}{2}\log_2 n+\log_2\log_2 n+O(1)$, and finally to $\frac{1}{2}\log_2 n+O(1)$, though at different speeds, where $n$ is the length of the code. Compared with the unbalanced case, our results imply that the balance constraint does not reduce the rate of the reconstruction code in the corresponding codebook.
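As a rough illustration of where the final $\frac{1}{2}\log_2 n+O(1)$ term comes from, the sketch below counts all balanced binary words of even length $n$ and compares the resulting redundancy, $n-\log_2\binom{n}{n/2}$, with the Stirling estimate $\frac{1}{2}\log_2 n+\frac{1}{2}\log_2\frac{\pi}{2}$. It is not the construction from the paper: it only measures the cost of the balance constraint itself, with no error correction, and the function name `balanced_redundancy` is an assumption made for this example.

```python
from math import comb, log2, pi


def balanced_redundancy(n: int) -> float:
    """Redundancy in bits of the set of ALL balanced binary words of length n.

    Hypothetical helper for illustration only: it counts words with exactly
    n/2 ones and does not model any reconstruction code.
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    # There are C(n, n/2) balanced words among the 2^n binary words.
    return n - log2(comb(n, n // 2))


def stirling_estimate(n: int) -> float:
    """Asymptotic estimate (1/2) log2 n + (1/2) log2(pi/2) of the same quantity."""
    return 0.5 * log2(n) + 0.5 * log2(pi / 2)


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        print(n, round(balanced_redundancy(n), 4), round(stirling_estimate(n), 4))
```

Since any balanced code is a subset of these words, $\frac{1}{2}\log_2 n+O(1)$ is the least redundancy a balanced code of length $n$ can have, which is consistent with the last value quoted in the abstract.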