The main obstacles to the practical deployment of DNA-based data storage platforms are the prohibitively high cost of synthetic DNA and the large number of errors introduced during synthesis. In particular, synthetic DNA products contain both symbol errors within individual oligos (fragments) and missing-oligo errors, at rates that exceed those of modern storage systems by orders of magnitude. These errors can be corrected either by using a large number of redundant oligos or by repeated cycles of writing, reading, and rewriting the information until the errors are eliminated. Both approaches add to the overall storage cost and are hence undesirable. Here we propose the first method for storing quantized images in DNA that uses signal processing and machine learning techniques to address the error and cost issues without resorting to redundant oligos or rewriting. Our methods rely on decoupling the RGB channels of images, performing specialized quantization and compression on the individual color channels, and using new discoloration detection and image inpainting techniques. We demonstrate the performance of our approach experimentally on a collection of movie posters stored in DNA.
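To make the channel-decoupling and quantization step concrete, the following is a minimal sketch and not the scheme used in this work. It assumes 8-bit RGB pixels and substitutes plain uniform quantization to 3 bits per channel for the specialized quantization mentioned above; the function names `split_channels`, `quantize`, and `dequantize` and the 3-bit depth are hypothetical choices made for illustration only.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # one 8-bit (R, G, B) triple


def split_channels(pixels: Sequence[Pixel]) -> Tuple[List[int], List[int], List[int]]:
    """Decouple a sequence of RGB pixels into three per-channel lists."""
    reds = [p[0] for p in pixels]
    greens = [p[1] for p in pixels]
    blues = [p[2] for p in pixels]
    return reds, greens, blues


def quantize(channel: Sequence[int], bits: int = 3) -> List[int]:
    """Uniform quantization (an assumption): keep the top `bits` bits of each 8-bit value."""
    shift = 8 - bits
    return [value >> shift for value in channel]


def dequantize(levels: Sequence[int], bits: int = 3) -> List[int]:
    """Map each quantization level back to the midpoint of its bin."""
    shift = 8 - bits
    half_bin = (1 << shift) // 2
    return [(level << shift) + half_bin for level in levels]


if __name__ == "__main__":
    pixels = [(200, 10, 255), (0, 128, 64)]
    reds, greens, blues = split_channels(pixels)
    # Each channel is quantized independently of the other two.
    print(quantize(reds), quantize(greens), quantize(blues))
    print(dequantize(quantize(reds)))
```

Handling each channel on its own means that damage to the data for one channel shows up as a color shift in that channel rather than as a lost pixel. That is the kind of artifact the discoloration detection and inpainting steps are described as targeting.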