The Paper Data Storage Option

Wonderful! In 100 years your great-grandkids can find your paper backups of your never-published memoirs in the attic! And if they’re very lucky they’ll remember something about you using this PaperBack program to generate them. Now all they have to do is load the program on their quantum computers!

Let’s see, there’s an old CD of that program here, but its data layer degraded before their parents were born. Oh good, there’s also a paper backup of the program… in PaperBack format.

Hm. Maybe your PaperBacked memoirs can be hung on the wall as antique art instead…

“Oh good, there’s also a paper backup of the program… in PaperBack format.”

BWAAAAAAAAAAAAAAHAHAHAHAHAHAHAHAH. I’ve been there xD.

Yet in 50 years no one will know how to read the 50-year-old, so-called compressed paper alphabet, and we will be back to square one.

That’s actually really interesting, I’m going to look for a Linux compatible paper storage solution.

Google realized this and have launched GMail Paper.

http://mail.google.com/mail/help/paper/more.html

I just signed up. Can’t wait for that first ream to come in!

I’m guessing you could probably dump an ASCII version of the encoding algorithm onto a couple of sides of A4

I’ve been interested in paper-based backup solutions for a while. I find it interesting that we can still use punched card and tape from 40 years ago without any problem, assuming we have working readers, but most original magnetic storage media has degraded so much that we can’t read it at all.

One other thought with this solution. I agree that we have to be careful what format the documents are stored in. We also need to make sure that we don’t assume 8-bit encoding, or ASCII, or anything else that we take for granted and could easily overlook.

Paper also has the advantage that humans can use nothing but their eyes to determine the degradation of their backup. You can look at a sheet of paper and say “This is still very good, it can easily last a few years longer” or “Wow, this is about to fade, we better renew the backup”. With a CD or DVD, you can’t even tell with your eyes whether it’s still good or already broken.

I think you’re way out where you shouldn’t be. Isn’t this a case of programmers being so much programmers they get things backwards now?

Actually, this is something that national libraries are bound to look into.

In most countries a national library is charged to preserve all books, newspapers and magazines that have been published in the country, forever (In the US the Library of Congress fulfills a similar role, as far as I know).

Nowadays this has been expanded to include digital materials – websites that are located in the country and suchlike. Also, to increase usability, virtually every national library has programs to digitize existing paper books.

There are obviously many problems dealing with ever increasing volume of data, even more so in digital form, where your best bet for web harvesting, for example, can only be to carefully select and harvest some snapshots of what you deem relevant of your country’s web.

However, for national libraries, the problem of storing digital materials for many hundreds of years – which is nothing unheard of for books, indeed – is particularly expensive to solve.

Preserving paper is something libraries are good at already, so something that lessens the upkeep (particularly energy) costs for preserving digitized materials is quite welcome.

So how much paper would it take to back up… the internet?

Jeff celebrates April 1st 4 months later, on August 1st.

May not be so obvious, but it’s got to be a joke.

Same problems today as they were yesterday, which result in trying to go back to old solutions. PaperBack is probably just a very conscious ironic statement.

Particularly elaborate, it seems, as it apparently also demonstrates how some people are willing to be misguided by all the lights, tout it as cool and useful, and ignore such important concepts as Green Technology, or fail to realize that the storage capacity is not even close to being comparable to today’s data storage media. The surfaces of 2 DVD-ROMs, for instance, fit perfectly on top of an A4 page. Let’s be modest and assume only a DVD-4. That’s close to 10 GB as opposed to the A4 page’s 500 KB.

The problem is longevity. We have warehouses of tapes from the NASA Apollo missions that are completely unreadable. What makes you think the DVD format is so special that it will still be around in 30 years? When’s the last time you saw an 8-inch floppy drive at CompUSA? Paper has two things going for it that no digital technology can match:

So look at the facts:

  1. We know that paper can last a long time, under the right conditions, (and even under not so ideal conditions), because, well, we have some really old paper.
  2. Most of our digital technology is really new. The bits of digital technology that are not new are completely useless nowadays to all but the rarest of experts on the technology, and even they need luck to get some of the old machines working (which is not always straightforward, as they may need parts that are no longer manufactured).
  3. You toiled years on that program, or that data, or that novel, you want it to still be around in 30 years or what? Look at what’s still around from 30 years ago, and make your choice.

Replication of data is the answer - not paper. Replication in format across multiple distribution mediums, down through the years, is the real long term storage strategy we must be moving towards.

Paper didn’t really exist as a feasible way to keep information prior to the printing press. Today we can store impossible amounts of data in cheap and effective ways that would impress even dear old Gutenberg.

We need to look at improving the quality of how we electronically save data so that it may be archived effectively for a long period of time. Keeping the data ‘live’ (i.e. on an active medium which is current) seems to be the standard approach.

The answers are already appearing with cloud computing and other services that can be used for this type of operation. The company is paid to provide a service and it is the business of the company to transfer data that is archaic in an obsolete file format to a usable later version based on the ongoing importance of the information i.e. someone who is willing to pay.

These are the modern day printers, busily making copies of information that is still deemed important to someone, somewhere for a fee.

Jeff’s cute idea is irrelevant and outdated. Sure, look to the past for many things to help guide us in the future (e.g. morals, principles, lessons of past failure, strategies for governance etc) but not technology. We haven’t reached the computing end game yet and going back to paper is a complete cop out.

Sorry, got my wires crossed a bit.

Paper’s got two things going for it: Longevity, and Visibility. You can see what’s printed on the paper. You can’t see the orientation of the iron atoms glued to a disc of plastic, or on the surface of a platter, or in the burnt-away bits of coating on the surface of a ‘CD-R’, which does not even have the advantage of being pressed in metal as a manufactured CD or DVD has.

@vince. Tell that to the thousands of geocities customers. I really have absolutely no faith in the longevity of my data in the cloud, and what you suggest in terms of replication is really like relying on a dead man’s switch to prevent the launch of a nuclear missile.

We need passive systems that can retain data without intervention. There’s really a finite amount of time that we can devote to retaining data in the way that you suggest. We can probably even come up with some mathematical formulation, charting the growth in the amount of data which needs to be preserved against the amount of time it will take to replicate it. How quickly will the former outpace the latter?
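A rough way to play with that question, as a sketch only: assume the archive grows by a fixed factor each year and that replication throughput is either flat or grows by its own factor. The function name, the parameters, and every number below are made up for illustration; nothing here is measured.

```python
def first_year_overrun(data_tb, growth, throughput_tb_per_year,
                       throughput_growth=1.0, budget_years=1.0, horizon=100):
    """Return the first year in which one full replication pass would take
    longer than budget_years, or None if that never happens within horizon.

    All inputs are hypothetical:
      data_tb                 archive size today, in TB
      growth                  yearly growth factor of the archive (1.5 = +50%)
      throughput_tb_per_year  how much can be copied per year today
      throughput_growth       yearly growth factor of that copy rate
    """
    for year in range(horizon + 1):
        # Time for one full pass over the archive at this year's copy rate.
        if data_tb / throughput_tb_per_year > budget_years:
            return year
        data_tb *= growth
        throughput_tb_per_year *= throughput_growth
    return None


# 1 TB today, growing 50% a year, with a flat copy rate of 10 TB a year.
print(first_year_overrun(1.0, 1.5, 10.0))

# Same archive, but the copy rate grows as fast as the data does.
print(first_year_overrun(1.0, 1.5, 10.0, throughput_growth=1.5))
```

Under these assumptions the former outpaces the latter only when the data grows faster than the ability to copy it; if both grow at the same rate, the replication pass never gets longer.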

I’m not sure what is meant by longevity. Modern DVDs produced with Super Cyanine (Taiyo Yuden’s) or metal-stabilized Cyanine (TDK’s) have lifespans of approximately 70 years. Far longer than the lifespan of the technologies used to read these formats. Even the 30-year lifespan of cheap DVDs is almost a guarantee of a longer lifespan than that of their format support.

Similarly, the PaperBack solution offers high media longevity, but no promise of media support. In fact, it presents the huge problem of also being dependent on encoder/decoder availability in 100 years, probably demanding a legacy system just for the purposes of a restore.

And then there is also the huge archiving problem it presents. Let’s not be shy here, you would need 8,000 A4 pages to match the capacity of a single 4GB DVD. And then you need to ask, how long will it take to scan 8,000 pages so I can have my restore?
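The arithmetic behind that figure can be checked with a quick sketch. It assumes the roughly 500 KB per A4 page mentioned earlier in the thread and a decimal 4 GB disc; the 30 seconds per page used for the scan estimate is a pure guess, not a measured scanner speed.

```python
PAGE_BYTES = 500_000        # ~500 KB per A4 page, as quoted in the thread
DVD_BYTES = 4_000_000_000   # the "4GB DVD" above, counted in decimal gigabytes


def pages_needed(total_bytes, page_bytes=PAGE_BYTES):
    """Number of pages needed to hold total_bytes, rounded up."""
    return -(-total_bytes // page_bytes)  # integer ceiling division


def scan_hours(pages, seconds_per_page):
    """Hours needed to scan all pages at a constant, assumed rate."""
    return pages * seconds_per_page / 3600


pages = pages_needed(DVD_BYTES)
print(pages)  # 8000

# seconds_per_page=30 is a hypothetical rate; plug in your own scanner's.
print(scan_hours(pages, seconds_per_page=30))
```

Changing `PAGE_BYTES` shows how sensitive the page count is to the density you actually manage to print and read back.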

A marginal gain in longevity does not immediately make all the obvious problems irrelevant. So I maintain that this is just a joke.

Right at the top of the page it says “Olly, the author of OllyDbg, presents his new open source joke”

Maybe everyone is taking it too seriously?

I don’t get the point of all of this… And no, I don’t think our alphabet is inefficient, it’s simply tuned for fast loading. Have “phun” reading data saved that way!