Archival Storage

For 45 of ICPSR's 51 years, the Consortium kept copies of its data holdings in three off-site locations: one in a physical warehouse south of Ann Arbor, and two on magnetic tapes. In 2006, ICPSR began consolidating its archival storage infrastructure, with four goals:

  • Moving digital content from tape to disk
  • Discarding old distribution paper content
  • Transferring archival paper to a storage facility for safekeeping
  • Closing the three off-site locations

Since the completion of that successful project, ICPSR has been adding redundancy to its archival storage processes by identifying multiple and varied methods and locations to back up its holdings. All told, the ICPSR data collection is only between five and six terabytes, a relatively small size in the world of digital preservation.

ICPSR currently maintains six copies of its data (and requires that any off-site backup be encrypted):

  • Two copies held locally
  • One copy written to tape onsite once a month
  • One copy elsewhere in Michigan
  • One copy in Amazon's storage cloud
  • One copy with the DuraSpace cloud storage system DuraCloud

Cloud Computing

The use of cloud computing is a recent addition to ICPSR's digital preservation practices, and one that has proven efficient and effective not only for archival storage, but also for backing up the Web-based data-delivery system. (Read more about this at Technology at ICPSR, the blog maintained by ICPSR's Computing and Network Services.)

ICPSR first started using Amazon's cloud services in 2009 to back up the Web delivery system. More recently, ICPSR was one of 10 organizations that used DuraSpace's DuraCloud services during beta testing. The advantages of using cloud-based services for ICPSR's archival storage are:

  • It increases the geographical diversity of ICPSR's backups
  • In the case of DuraCloud, it allows the convenience of one billing relationship with the ability to store content in the cloud-storage services of multiple providers
  • It is relatively easy to increase or decrease the amount of storage space available
  • It is fairly easy and cheap to maintain and synchronize multiple copies of data
  • Larger public cloud service providers are at least as secure as most local data centers
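
To illustrate the point about keeping multiple copies synchronized, the following is a minimal sketch of one common way to check that a replica still matches a primary copy: comparing SHA-256 checksums file by file. The function names and the directory layout are hypothetical and do not describe ICPSR's actual tooling. The sketch also assumes both copies are reachable as local directories; a cloud-hosted copy would have to be listed through the provider's own interface instead.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(primary: Path, replica: Path) -> dict[str, list[str]]:
    """Report files missing from, extra in, or changed in the replica."""
    source, copy = manifest(primary), manifest(replica)
    shared = set(source) & set(copy)
    return {
        "missing": sorted(set(source) - set(copy)),
        "extra": sorted(set(copy) - set(source)),
        "changed": sorted(k for k in shared if source[k] != copy[k]),
    }
```

An empty list under each of the three keys would indicate that the two copies hold the same files with the same contents. Any entries point to files that need to be copied again or investigated.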