Repository Operations

As a repository, ICPSR adheres to operational standards that demonstrate it is organizationally, procedurally, and technologically sound as a trustworthy data custodian. This page describes the processes we follow and the standards we apply in acquiring, enhancing, describing, preserving, and disseminating digital social science data.

ICPSR processes are based on the Reference Model for an Open Archival Information System (OAIS), an ISO standard that provides the functional framework for sustaining digital objects in managed repositories. The following image provides a detailed flowchart of ICPSR’s processes with respect to the OAIS Reference Model.

Ingest

At ICPSR, all data are deposited via an electronic Deposit Form. Files uploaded via this secure system are given unique deposit IDs and moved to the appropriate area for further processing. Data producers sign off on the deposit and ICPSR sends confirmation of the receipt of materials.

The ICPSR Collection Development Policy provides guidance to ICPSR staff in selecting data for long-term archiving. The document lists selection criteria, major areas of emphasis, and appraisal considerations.

Curation

After acquisition, data at ICPSR are enhanced (“curated”) with meaningful information to make them complete, self-explanatory, and usable for future researchers.

Curation activities, as described in ICPSR’s Curation Levels (pdf), may include:

  • Review data for confidentiality issues
  • Generate multiple data formats for dissemination and preservation
  • Create documentation compliant with the DDI specification
  • Recode variables to address confidentiality concerns
  • Check for undocumented/out-of-range codes
  • Add question text to variables
  • Create variable labels
  • Create value labels
  • Standardize missing values
  • Identify and address foreign language characters
  • Adjust format widths
  • Create a metadata record
  • Check for consistency and skip patterns
  • Make online analysis version with question text
  • Gather citations to related publications for the Bibliography of Data-Related Literature
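Two of the checks listed above, standardizing missing values and checking for undocumented or out-of-range codes, can be illustrated with a minimal sketch. The variable names, codebook, and missing-value codes here are hypothetical and chosen only for illustration; they do not describe ICPSR's actual tooling or conventions.

```python
# Hypothetical codebook: the variable names, valid codes, and missing-value
# conventions below are illustrative assumptions, not ICPSR's actual rules.
CODEBOOK = {
    "sex": {1: "Male", 2: "Female"},
    "employed": {0: "No", 1: "Yes"},
}
# Depositor-specific missing codes, mapped to a single standard code.
MISSING_CODES = {-9, 99, 999}
STANDARD_MISSING = -1


def standardize_missing(rows):
    """Return copies of the rows with every known missing code replaced."""
    return [
        {var: STANDARD_MISSING if value in MISSING_CODES else value
         for var, value in row.items()}
        for row in rows
    ]


def find_undocumented_codes(rows):
    """Return {variable: sorted list of values absent from the codebook}."""
    problems = {}
    for row in rows:
        for var, value in row.items():
            documented = CODEBOOK.get(var, {})
            if value != STANDARD_MISSING and value not in documented:
                problems.setdefault(var, set()).add(value)
    return {var: sorted(values) for var, values in problems.items()}


rows = [
    {"sex": 1, "employed": 1},
    {"sex": 2, "employed": 99},   # 99 is a depositor missing code
    {"sex": 3, "employed": 0},    # 3 is not documented for "sex"
]
cleaned = standardize_missing(rows)
report = find_undocumented_codes(cleaned)
print(report)  # {'sex': [3]}
```

Running the missing-value step first keeps known missing codes from being reported as undocumented; a curator would then resolve anything left in the report with the data producer.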

Metadata

Metadata (i.e., information about our data collections that helps others discover, understand, and use them) are essential for maximizing the usefulness of data. Because it is often impossible for secondary researchers to ask questions of the original data producers, metadata are the de facto form of communication between them. Comprehensive metadata standardize how the data are described, enable a deeper comprehension of a dataset, facilitate data searches by variables, and offer a variety of display options on the web.

ICPSR creates metadata primarily from information supplied by data depositors. Metadata creation, enhancement, and quality review are a team effort and involve staff from across all of ICPSR. All metadata records are reviewed and approved by Metadata & Preservation unit staff, who check for adherence to standards and advise on all metadata-related activities during the data curation lifecycle.

Learn more about ICPSR’s metadata documentation and accessing metadata records.

Preservation

ICPSR is committed to digital preservation: the proactive and ongoing management of digital content to lengthen its lifespan and mitigate the risk of loss. ICPSR preserves its data resources for the long term, guarding against deterioration, accidental loss, and digital obsolescence. ICPSR has a decades-long track record of reliably storing research data. A key part of ICPSR’s mission is to preserve the data we steward, including the data archived at our founding in 1962.

ICPSR’s preservation efforts are outlined in the Digital Preservation Policy Framework, which informs processes across the organization. Key responsibilities are distributed among multiple ICPSR units:

  • The Metadata and Preservation unit promotes good practice to align ICPSR with the digital preservation community.
  • Curation staff ensure that core preservation activities are completed and documented, as data and associated files are acquired, processed, and prepared for release.
  • The Information and Technology unit ensures secure, automated, and redundant storage.
  • Leadership and governance groups guide policies and oversight.

Data are securely stored in multiple formats and locations, with evolving strategies to ensure resilience over time.
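This page does not say how copies in different locations are verified. One widely used approach in digital preservation is a fixity check, which compares each copy's checksum against a digest recorded at ingest. The sketch below assumes that approach for illustration only; the file names and the choice of SHA-256 are hypothetical, not a description of ICPSR's systems.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths, expected_digest):
    """Return {path: True/False} for each copy versus the recorded digest."""
    return {str(p): sha256_of(p) == expected_digest for p in paths}


# Demonstration with two temporary "storage locations" (hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    primary = Path(tmp) / "primary.dat"
    replica = Path(tmp) / "replica.dat"
    primary.write_bytes(b"survey data")
    replica.write_bytes(b"survey data")
    recorded = sha256_of(primary)        # digest recorded at ingest time

    replica.write_bytes(b"survey dat4")  # simulate silent corruption
    results = copies_match([primary, replica], recorded)
    damaged = [name for name, ok in results.items() if not ok]
    print(damaged)  # lists only the replica's path
```

A copy that fails the check could then be restored from one that still matches the recorded digest.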

Dissemination

ICPSR disseminates data to researchers, students, policymakers, and journalists around the world based on its Access Policy Framework. Users at member institutions may download all data directly from ICPSR. Many datasets are freely available to the public through the thematic collections. Access to some data is restricted.

Users downloading data or analyzing them online are expected to comply with standards of responsible use. Before gaining access to data, users are asked to read a Responsible Use Statement that says the following:

  • The datasets are to be used solely for statistical analysis and reporting aggregated information.
  • The confidentiality of research participants is to be guarded in all ways.
  • Anything that can potentially breach participants’ confidentiality is to be reported promptly to ICPSR.
  • The data are not to be redistributed or sold to others without the written agreement of ICPSR.
  • The user will inform ICPSR of the use of the data in books, articles, and other forms of publication.

Each data collection receives a data citation with a unique, persistent Digital Object Identifier (DOI), as in this example:

Goldin, Claudia, and Katz, Lawrence. The 1915 Iowa State Census Project. Inter-university Consortium for Political and Social Research [distributor], 2010-12-14. https://doi.org/10.3886/ICPSR28501.v1
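The DOI in the example above appears to encode a study number (28501) and a version (v1). The sketch below pulls those two parts out of a DOI URL. The pattern is inferred from this single example and is an assumption; other ICPSR DOIs may be structured differently, and the function name is hypothetical.

```python
import re

# Pattern inferred from the single example DOI above; other ICPSR DOIs
# may not follow it, so treat this as an assumption.
DOI_PATTERN = re.compile(r"10\.3886/ICPSR(?P<study>\d+)\.v(?P<version>\d+)$")


def parse_icpsr_doi(doi_url):
    """Extract the study number and version from a DOI URL, or None."""
    match = DOI_PATTERN.search(doi_url)
    if match is None:
        return None
    return {"study": int(match.group("study")),
            "version": int(match.group("version"))}


parsed = parse_icpsr_doi("https://doi.org/10.3886/ICPSR28501.v1")
print(parsed)  # {'study': 28501, 'version': 1}
```

Returning None for a non-matching DOI, rather than raising an error, lets a caller skip identifiers that do not follow the assumed pattern.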

Citations for ICPSR data can be found in the following locations:

  1. Study descriptions that appear on the website
  2. File manifest
  3. PDF study description file

Both the file manifest and the PDF study description file are automatically included with every download.