The set of all $ q $-ary strings that do not contain consecutively repeated substrings of length $ \leqslant\! 3 $ (i.e., that do not contain substrings of the form $ a a $, $ a b a b $, or $ a b c a b c $) constitutes a code correcting an arbitrary number of tandem-duplication mutations of length $ \leqslant\! 3 $. In other words, any two such strings are non-confusable in the sense that they cannot produce the same string while evolving under tandem duplications of length $ \leqslant\! 3 $. We demonstrate that this code is asymptotically optimal in terms of rate, meaning that it represents the largest set of non-confusable strings up to subexponential factors. This result settles the zero-error capacity problem for the last remaining case of tandem-duplication channels satisfying the "root-uniqueness" property.
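The code construction described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `has_short_tandem_repeat`, `codewords`, and `root` are hypothetical, and the decoder simply deletes one copy of a short tandem repeat until none remain, relying on the root-uniqueness property stated in the abstract for the result to be independent of the deletion order.

```python
from itertools import product


def has_short_tandem_repeat(s, max_len=3):
    """Return True if s contains a substring xx with 1 <= len(x) <= max_len."""
    for k in range(1, max_len + 1):
        for i in range(len(s) - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False


def codewords(alphabet, n, max_len=3):
    """Enumerate all length-n strings over alphabet with no short tandem repeat."""
    words = ("".join(w) for w in product(alphabet, repeat=n))
    return [w for w in words if not has_short_tandem_repeat(w, max_len)]


def root(s, max_len=3):
    """Undo tandem duplications of length <= max_len by repeated deduplication.

    Assumption (hypothetical decoder): one copy of the first repeat found
    is removed at each step until the string contains no short repeat.
    """
    changed = True
    while changed:
        changed = False
        for k in range(1, max_len + 1):
            for i in range(len(s) - 2 * k + 1):
                if s[i:i + k] == s[i + k:i + 2 * k]:
                    s = s[:i + k] + s[i + 2 * k:]  # drop the second copy
                    changed = True
                    break
            if changed:
                break
    return s


if __name__ == "__main__":
    print(has_short_tandem_repeat("abcabc"))  # True
    print(has_short_tandem_repeat("abcacb"))  # False
    print(len(codewords("abc", 4)))           # 18
    print(root("abcabcc"))                    # abc
```

In this sketch a received string is decoded by computing its root; two distinct codewords can never share a root because each codeword is its own root.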