Preserving The Internet... and Everything Else

Yes, maybe it is just the world’s largest and most open hard drive, but as far as I know, nobody else is doing this important work.

Really, the Internet Archive is trying and doing a great job, but it is only part of the picture. Maybe the single largest part, but what has actually been going on for the past decade or so is that many (or should I say “most”) countries with an Internet presence have been archiving their own countries’ websites, insofar as they can, as part of preserving their culture.

It’s called Web harvesting, and it has been a whole field of study in information science for quite some time, dealing not only with physical preservation but also with logical preservation (changing formats, new formats to archive, multimedia, etc.).

The task is frequently associated with the country’s national library (the Library of Congress has a somewhat similar role in the US), because its mission is usually to preserve the cultural heritage of the nation – all the printed books, magazines, newspapers and other public materials. Preserving the public web is just an extension of this.

Unfortunately, the task is so enormous that most countries are very selective about which websites they preserve. In this sense the Internet Archive is actually quite unique, since it is non-discriminating. On the other hand, as far as my country is concerned, IA has been preserving about what… 1% of the websites here, I believe.

But yes, long-term digital preservation has been a topic in libraries (especially national ones), information science and computer science for quite some time now, and there are lots of interesting issues at stake. For example, if we compare this sort of “national” or “global” memory with human memory, we must not forget (forgive the pun) that it is quite important for humans to be able to forget things sometimes. In national and global terms, this could be rephrased as follows: maybe information should be remembered only while there is somebody to whom it is important enough that they preserve it themselves.

A very broad topic, anyway, but for a list of web archiving initiatives, you can check http://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives and related info.

Great post! I didn’t know their office is right in SF! I wonder if they’re open to the public. It does look like a church (because it was) and it’s hilarious.

The Computer Magazines collection has been unavailable for three or four days. The page shows this message:

The item is not available due to issues with the item’s content.

If you would like to report this problem as an error report,
you may do so here.

Does anyone know what the problem is and when the archive will be back?

http://archive.org/details/computermagazines