Motivated by applications in polymer-based data storage, we introduce the new problem of characterizing the code rate of, and designing, constant-weight binary $B_2$-sequences. Binary $B_2$-sequences are collections of binary strings of length $n$ with the property that the real-valued sums of all distinct pairs of strings are distinct. In addition to this defining property, constant-weight binary $B_2$-sequences satisfy the constraint that each string has a fixed, relatively small weight $\omega$ that scales linearly with $n$. The constant-weight constraint ensures low-cost synthesis and uniform processing of the data readout via tandem mass spectrometers. Our main results include upper bounds on the size of the codes, formulated as entropy-optimization problems, and constructive lower bounds based on Sidon sequences.
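The defining property can be made concrete with a small brute-force check. The following is only a minimal sketch, assuming that "real-valued sum" means the componentwise integer sum of two strings and that "distinct pairs" means unordered pairs of different strings; the function names `pair_sum`, `is_b2`, and `is_constant_weight_b2` are hypothetical and chosen for illustration, not taken from the paper.

```python
from itertools import combinations


def pair_sum(a, b):
    """Componentwise integer (real-valued) sum of two binary strings."""
    return tuple(x + y for x, y in zip(a, b))


def is_b2(codewords):
    """Return True if all sums of unordered pairs of distinct strings differ.

    Assumption: strings are given as sequences of 0/1 integers of equal length.
    """
    words = [tuple(w) for w in codewords]
    if len(set(words)) != len(words):
        return False  # repeated strings are not allowed in the collection
    seen = set()
    for a, b in combinations(words, 2):
        s = pair_sum(a, b)
        if s in seen:
            return False  # two different pairs share the same sum
        seen.add(s)
    return True


def is_constant_weight_b2(codewords, weight):
    """Check the B_2 property together with the fixed-weight constraint."""
    return all(sum(w) == weight for w in codewords) and is_b2(codewords)


# Toy example with n = 4 and weight 2 (illustrative values only).
good = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0)]
# Adding 0101 breaks the property: 1100 + 0011 and 1010 + 0101 are both 1111.
bad = good + [(0, 1, 0, 1)]
```

For the toy collections above, `is_constant_weight_b2(good, 2)` holds while `is_constant_weight_b2(bad, 2)` does not. The check is quadratic in the number of strings and is meant only to illustrate the definition, not the constructions or bounds of the paper.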