This is a guest post by Ondřej Čertík. Ondřej formerly worked at Los Alamos National Labs and now works for GSI Technologies. He is known in the P

Archiving data on paper

submited by
Style Pass
2024-05-21 00:30:03

This is a guest post by Ondřej Čertík. Ondřej formerly worked at Los Alamos National Labs and now works for GSI Technologies. He is known in the Python community for starting the SymPy project and in the Fortran community for starting LFortran. — John

I finally got to experiment a bit with archiving data on a regular A4 or US Letter page using a regular printer and a phone camera to read it. It’s been bothering me for about 10 years. What is the maximum data size that we can store on the paper and reliably retrieve?

It seems it is limited by my camera, not my printer. This image stores 30KB of data. I printed it, took a photo with my iPhone, and then decoded and unpacked it using tar and bzip2. It correctly unpacks into 360KB of C++ files, 6305 lines.

It uses Golay codes (used by Voyager) that can fix 3 bad bits in each 24-bit code word (which contains 12-bit payload, and 12-bit parity bits). If there are 4 bad bits, the errors are detected, but cannot be corrected. In my experiment, there were 6 codewords which had 3 bad bits, and no codewords with more bad bits, so there was no data loss. It seems most of the bad bits were in the area where my phone cast a shadow on the paper, so possibly retaking the picture in broad daylight might help.

Leave a Comment