Over the last few decades, many of our traditionally analog assets have become digital: documents, photographs, music, scientific data, medical data, intelligence data, and so on. One critical result is that we now have huge volumes of digital information assets that should remain accessible, usable and undamaged over long periods of time - longer than the lifetime of any particular storage system or storage technology. We need to make this possible at very low cost, or we have no chance of preserving enough of the materials that will be important to us in the future. Here are two digital preservation projects to which I contribute my time and effort.
The Pharaoh Project
In the Pharaoh Project we are designing and implementing an archival storage system based on three key ideas
Here are a couple of papers that describe some of the motivation for and progress of the project:
LOCKSS stands for Lots of Copies Keep Stuff Safe. LOCKSS is open-source software that turns an ordinary PC into a digital preservation appliance, preserving the integrity of and access to online materials. A LOCKSS appliance is a node in a peer-to-peer network that implements a very robust protocol for auditing and repairing the online content. Libraries around the world (on six continents!) have joined the LOCKSS network to preserve the electronically published materials to which they subscribe. The project originated at Stanford University Libraries and is still headquartered there. While I was still working at Stanford, David Rosenthal and Vicky Reich approached my research group about collaborating on the design of a new LOCKSS protocol that would be highly scalable and highly resistant to very powerful attacks that attempt to damage, delete or impede access to the preserved materials. Such attacks are unfortunately a real problem for online repositories. The LOCKSS engineering team is currently rolling out the resulting protocol across the deployed LOCKSS network. For detailed information about the goals, technical challenges and our solutions, please see these papers: