Random access DNA memory using Boolean search in an archival file storage system

submitted by
Style Pass
2024-11-01 06:00:03

Author Contributions Statement. J.L.B., T.R.S., and M.B. designed the file labeling and selection scheme. J.L.B., T.R.S., and C.M.A. implemented the file selection scheme using FAS. J.B. and T.R.S. developed the encoding scheme and metadata tagging of the images to DNA. T.R.S. designed the plasmid for encoding imaging. H.H. and T.R.S. performed the cloning, transformation, and purification of the plasmids. J.L.B. synthesized and purified all the TAMRA and AFDye 647-labelled DNA oligonucleotides. J.L.B. characterized the particles. J.L.B. developed the synthetic route to attach DNA barcodes on the surface of the particles. J.L.B. performed the encapsulation, barcoding, sorting, reverse encapsulation of the particles after sorting, and desalting. T.R.S., H.H., and M.R. performed the sequencing. J.B. performed computational validation of the orthogonality of barcode sequences and J.L.B. performed the experimental validation of the orthogonality of barcode and probe sequences. J.B. developed the computational workflow to analyze the sequencing data, including statistical analyses. M.B. conceived of the file system and supervised the entire project. P.C.B. supervised the FAS selection and supervised the sequencing workflow. All authors analyzed the data and equally contributed to the writing of the manuscript.

DNA is an ultra-high-density storage medium that could meet exponentially growing worldwide demand for archival data storage if DNA synthesis costs declined sufficiently and random access of files within exabyte-to-yottabyte-scale DNA data pools were feasible. Here we demonstrate a path to overcome the second barrier by encapsulating data-encoding DNA file sequences within impervious silica capsules that are surface-labeled with single-stranded DNA barcodes. Barcodes are chosen to represent file metadata, enabling selection of sets of files with Boolean logic directly, without use of amplification. We demonstrate random access of image files from a prototypical 2 KB image database using fluorescence sorting with selection sensitivity of 1 in 10^6 files, which thereby enables 1 in 10^(6N) selection capability using N optical channels. Our strategy thereby offers a scalable concept for random access of archival files in large-scale molecular datasets.
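
The selection idea in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' method: it treats each capsule as a file name paired with a set of metadata barcodes and filters them with AND, OR, and NOT conditions, then works out the 10^(6N) scaling arithmetic. The file names, barcode labels, and the functions `select` and `selection_capability` are all hypothetical, introduced only for illustration.

```python
# Hypothetical toy database: each file (capsule) carries a set of metadata barcodes.
FILES = {
    "cat.jpg": {"cat", "domestic"},
    "tiger.jpg": {"cat", "wild"},
    "wolf.jpg": {"dog", "wild"},
    "plane.jpg": {"vehicle"},
}


def select(files, all_of=(), any_of=(), none_of=()):
    """Return the sorted names of files whose barcodes satisfy the Boolean query.

    all_of  -> every listed barcode must be present (AND)
    any_of  -> at least one listed barcode must be present (OR); ignored if empty
    none_of -> no listed barcode may be present (NOT)
    """
    all_of, any_of, none_of = set(all_of), set(any_of), set(none_of)
    hits = []
    for name, barcodes in files.items():
        if not all_of <= barcodes:
            continue  # a required barcode is missing
        if any_of and not (any_of & barcodes):
            continue  # none of the alternative barcodes is present
        if none_of & barcodes:
            continue  # an excluded barcode is present
        hits.append(name)
    return sorted(hits)


def selection_capability(n_channels, per_channel=10**6):
    """Pool size addressable at 1-in-per_channel sensitivity per optical channel.

    Assumes channels combine multiplicatively, as the abstract's 10^(6N) figure implies.
    """
    return per_channel ** n_channels


print(select(FILES, all_of={"cat"}, none_of={"wild"}))
print(selection_capability(2))
```

In this toy model the query "cat AND NOT wild" keeps only the files labeled `cat` that lack the `wild` label, and two channels at 1 in 10^6 each correspond to a pool of 10^12 files.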
