The DNA storage channel is considered, in which a codeword comprises $M$ unordered DNA molecules. At reading time, $N$ molecules are sampled with replacement, and each sampled molecule is then sequenced. A coded-index concatenated-coding scheme is considered, in which the $m$th molecule of the codeword is restricted to a subset of all possible molecules (an inner code) that is unique for each $m$. The decoder has low complexity: it first decodes each molecule separately (the inner code) and then decodes the sequence of molecules (the outer code). Only mild assumptions are made on the sequencing channel, namely the existence of an inner code and decoder with vanishing error. The error probabilities of a random code and of an expurgated code are analyzed and shown to decay exponentially with $N$. This establishes the importance of increasing the coverage depth $N/M$ in order to obtain a low error probability.
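To illustrate why the coverage depth $N/M$ matters, the sampling stage alone can be simulated. The following is a minimal sketch, not the error analysis of the paper: it assumes molecules are drawn uniformly at random with replacement and ignores sequencing errors entirely. The function names `unsampled_fraction` and `expected_unsampled_fraction` are hypothetical and chosen only for this illustration. Under uniform sampling, the expected fraction of molecules that are never read is $(1 - 1/M)^N$, which is close to $e^{-N/M}$ for large $M$.

```python
import math
import random


def unsampled_fraction(num_molecules: int, num_reads: int, rng: random.Random) -> float:
    """Simulate N uniform draws with replacement from M molecules and
    return the fraction of molecules that were never drawn."""
    seen = {rng.randrange(num_molecules) for _ in range(num_reads)}
    return 1.0 - len(seen) / num_molecules


def expected_unsampled_fraction(num_molecules: int, num_reads: int) -> float:
    """Expected fraction of molecules never drawn: (1 - 1/M)^N."""
    return (1.0 - 1.0 / num_molecules) ** num_reads


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
    M = 1000                # illustrative value, not taken from the paper
    for depth in (1, 2, 5, 10):
        N = depth * M
        simulated = unsampled_fraction(M, N, rng)
        exact = expected_unsampled_fraction(M, N)
        approx = math.exp(-N / M)
        print(f"N/M={depth:2d}  simulated={simulated:.4f}  "
              f"(1-1/M)^N={exact:.4f}  exp(-N/M)={approx:.4f}")
```

In this simplified model, the fraction of molecules that the outer code must treat as missing shrinks exponentially in $N/M$, which is consistent with the role of the coverage depth described above.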