The trace reconstruction problem asks how many noisy samples are needed to recover an unknown string $\boldsymbol{x}\in\{0,1\}^n$ with high probability, where each sample is obtained independently by passing $\boldsymbol{x}$ through a random deletion channel with deletion probability $p$. The problem has received significant attention recently due to its applications in DNA sequencing and DNA storage. Yet there is still an exponential gap between the known upper and lower bounds for the trace reconstruction problem. In this paper, we study the trace reconstruction problem when $\boldsymbol{x}$ is confined to an edit distance ball of radius $k$, which is essentially equivalent to distinguishing two strings with edit distance at most $k$. We show that $n^{O(k)}$ samples suffice to achieve this task with high probability.
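
To make the sampling model concrete, the following is a minimal sketch of the deletion channel described above: each symbol of $\boldsymbol{x}$ is deleted independently with probability $p$, and the surviving symbols are concatenated in order. The function names (`deletion_channel`, `sample_traces`, `is_subsequence`) and the seeding scheme are illustrative assumptions, not taken from the paper, and the sketch only simulates samples; it does not implement the reconstruction procedure.

```python
import random


def deletion_channel(x, p, rng):
    """Return one noisy sample of the bit string x.

    Each symbol is deleted independently with probability p; the
    remaining symbols keep their original order.
    """
    return "".join(symbol for symbol in x if rng.random() >= p)


def sample_traces(x, p, num_traces, seed=0):
    """Draw num_traces independent samples of x (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    return [deletion_channel(x, p, rng) for _ in range(num_traces)]


def is_subsequence(trace, x):
    """Check that trace can be obtained from x by deletions alone."""
    remaining = iter(x)
    # `symbol in remaining` advances the iterator past the first match,
    # so later symbols must be found further to the right in x.
    return all(symbol in remaining for symbol in trace)


if __name__ == "__main__":
    x = "1011001"
    for trace in sample_traces(x, p=0.3, num_traces=5, seed=1):
        print(trace, is_subsequence(trace, x))
```

Every sample produced this way is a subsequence of $\boldsymbol{x}$, which is why two candidate strings at small edit distance can yield very similar sample distributions and may require many samples to tell apart.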