The well-known trace reconstruction problem is the problem of inferring an unknown source string $x \in \{0,1\}^n$ from independent "traces", i.e. copies of $x$ that have been corrupted by a $\delta$-deletion channel which independently deletes each bit of $x$ with probability $\delta$ and concatenates the surviving bits. The current paper considers the extreme data-limited regime in which only a single trace is provided to the reconstruction algorithm. In this setting exact reconstruction is of course impossible, and the question is to what accuracy the source string $x$ can be approximately reconstructed. We give a detailed study of this question, providing algorithms and lower bounds for the high, intermediate, and low deletion rate regimes in both the worst-case ($x$ is arbitrary) and average-case ($x$ is drawn uniformly from $\{0,1\}^n$) models. In several cases the lower bounds we establish are matched by computationally efficient algorithms that we provide. We highlight our results for the high deletion rate regime. Roughly speaking, they show that:
- Having access to a single trace is already quite useful for worst-case trace reconstruction: an efficient algorithm can perform much more accurate reconstruction, given one trace that is even only a few bits long, than it could given no traces at all.
- In contrast, in the average-case setting, having access to a single trace is provably not very useful: no algorithm, computationally efficient or otherwise, can achieve significantly higher accuracy given one trace that is $o(n)$ bits long than it could with no traces.
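
To make the channel model concrete, the following is a minimal Python sketch of a $\delta$-deletion channel as described above. The function names `deletion_channel` and `is_subsequence`, the string representation of $x$, and the optional `rng` parameter are illustrative assumptions and do not come from the paper; the sketch only simulates how a single trace is generated and does not implement any reconstruction algorithm.

```python
import random


def deletion_channel(x, delta, rng=None):
    """Return one trace of the bit string x.

    Each bit is deleted independently with probability delta, and the
    surviving bits are concatenated in their original order.
    (Hypothetical helper for illustration only.)
    """
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so a bit survives with probability 1 - delta.
    return "".join(bit for bit in x if rng.random() >= delta)


def is_subsequence(trace, x):
    """Check that trace can be obtained from x by deleting bits."""
    remaining = iter(x)
    # "bit in remaining" advances the iterator past the first match.
    return all(bit in remaining for bit in trace)


if __name__ == "__main__":
    source = "0110100110010110"
    trace = deletion_channel(source, delta=0.9, rng=random.Random(0))
    print(len(trace), is_subsequence(trace, source))
```

With a high deletion rate such as $\delta = 0.9$, the expected trace length is $(1-\delta)n$, so the single trace is typically only a small fraction of the length of $x$.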