Overview
The Data-PASS partners, with Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) funding, have built a prototype storage platform for policy-driven distributed replication of digital holdings. The partners have received funding from the Institute of Museum and Library Services to develop this prototype into a self-contained system that can be installed, used, and maintained by institutional staff without technical expertise. This will result in a set of widely disseminated open source tools that can easily be used by libraries, museums, and archives that wish to collaborate in replicating their own content.
How it will work
This prototype system is built around a core LOCKSS network. Participating institutions will expose content through OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting) or through the Dataverse Network (DVN) digital library system. Each institution will then choose which of its own content, and which of the other partners' content, to replicate by creating policies (or rules), which will be formalized in a machine-readable schema. Since partners vary in size and technology, the policy commitments will be able to scale to the participants' resources.
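As a rough illustration of the harvesting side, the sketch below builds an OAI-PMH ListRecords request URL. The verb and the metadataPrefix, set, and from parameters come from the OAI-PMH specification; the base URL, the set name "public", and the helper name list_records_url are assumptions made up for this example and are not part of the SAFE-Archive code.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    # "verb" and "metadataPrefix" are required by OAI-PMH for ListRecords.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set exposed by the repository.
        params["set"] = set_spec
    if from_date:
        # Optional: only records changed on or after this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# The endpoint and set name below are placeholders, not real services.
url = list_records_url(
    "https://archive.example.org/oai", set_spec="public", from_date="2010-01-01"
)
print(url)
```

A harvesting peer would fetch this URL and follow any resumption tokens in the response. That part is omitted here.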
The complete public holdings of each partner, including metadata, data, documentation, and legal agreements, will be replicated by the network. When new collections are added to the preservation network, the system will automatically identify collaborating peers with the required resources and initiate regular harvesting by those peers. Previous versions of the replicated content will also be maintained. Replicated copies will be geographically and institutionally distributed, which guards against both technical and organizational preservation failures.
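One way to picture the peer-identification step is sketched below. It is a simplified illustration, not the SAFE-Archive algorithm: the Peer fields (institution, region, free_gb), the select_peers function, and the greedy strategy are all assumptions chosen for this example. It picks peers with enough free storage and prefers copies held at different institutions and in different regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    # All fields are hypothetical; the real policy schema may differ.
    name: str
    institution: str
    region: str
    free_gb: float


def select_peers(peers, size_gb, copies):
    """Choose up to `copies` peers that can hold a collection of `size_gb`."""
    # Keep only peers with enough free space, largest capacity first.
    eligible = sorted(
        (p for p in peers if p.free_gb >= size_gb), key=lambda p: -p.free_gb
    )
    chosen = []
    # First pass: insist on a distinct institution and region for each copy.
    for p in eligible:
        if len(chosen) == copies:
            break
        if all(p.institution != c.institution and p.region != c.region for c in chosen):
            chosen.append(p)
    # Second pass: if diversity alone cannot fill the quota, add any eligible peer.
    for p in eligible:
        if len(chosen) == copies:
            break
        if p not in chosen:
            chosen.append(p)
    return chosen


peers = [
    Peer("a", "X", "east", 500),
    Peer("b", "X", "east", 400),
    Peer("c", "Y", "west", 300),
    Peer("d", "Z", "south", 50),
]
print([p.name for p in select_peers(peers, size_gb=100, copies=2)])
```

If fewer eligible peers exist than the requested number of copies, the function returns a shorter list, which a real system would need to report as an unmet policy.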
Content in the LOCKSS network will be audited regularly to demonstrate conformance with preservation requirements. While each partner will be trusted to hold others' public content and not to disseminate it improperly, no partner will be trusted to have "super-user" rights. Trust will be verified through automated audits against trusted repository requirements, an approach that combines the reliability of a top-down replication system with the resilience of a peer-to-peer model.
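The idea of an automated conformance audit can be sketched as follows. This is a deliberately simplified illustration and not the LOCKSS polling protocol: it assumes each peer reports a content digest per collection, and it flags any collection for which fewer peers agree on a digest than the policy requires. The report layout and the audit function are hypothetical.

```python
from collections import Counter


def audit(reports, required_copies):
    """Return {collection_id: agreeing_copies} for non-conforming collections.

    `reports` maps a collection id to {peer_name: content_digest}
    (an assumed layout for this sketch).
    """
    failures = {}
    for collection_id, digests in reports.items():
        if digests:
            # Size of the largest group of peers reporting the same digest.
            _, agreeing = Counter(digests.values()).most_common(1)[0]
        else:
            agreeing = 0
        if agreeing < required_copies:
            failures[collection_id] = agreeing
    return failures


reports = {
    "c1": {"a": "h1", "b": "h1", "c": "h1"},  # three matching copies
    "c2": {"a": "h2", "b": "hX"},             # copies disagree
}
print(audit(reports, required_copies=3))
```

Because the check only compares what peers report against the policy, no single partner needs elevated rights over the others to run it.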
Planned Releases
Fall 2010
- Extensions to the Dataverse Network System to allow any dataverse owner to expose selected content for replication harvesting.
- Version 1.0 of the SAFE-Archive policy schema and complete documentation on using the schema with TRAC.
Winter 2010
- Version 1.0 of the SAFE-Archive system, supporting installation, policy configuration, and monitoring of LOCKSS networks.
Spring-Summer 2011
- Version 1.1 of the SAFE-Archive system, supporting auto-reconfiguration of LOCKSS networks to reflect changes to the policy schema.
- Online courses and written guides.
More information
For a more detailed overview see:
Altman, M., Beecher, B., Crabtree, J., Andreev, L., Bachman, E., Buchbinder, A., Burling, S., King, P., & Maynard, M. (2009). A Prototype Platform for Policy-Based Archival Replication. Against the Grain 21(2).
For current code and documentation, see: http://safearchive.sourceforge.net/.
For other presentations and publications, see: http://www.icpsr.umich.edu/DATAPASS/presentations.jsp.







