Life of a Dataset
The Life of a Dataset presentation began as a tour of the Perry Building, home of ICPSR, during the Welcome Reception of the 2011 Biennial OR Meeting held in Ann Arbor, October 5-7, 2011. The goal was to familiarize Official Representatives (ORs) and others with the staff at ICPSR and the work that they do to bring data to the data user community. The tour was later presented as a poster session at ICPSR's 50th Anniversary Open House on Thursday, June 7, 2012.
The features of ICPSR's process that were presented in the Life of a Dataset Tour are based on the work of the members-only General Archive. However, the topical archives' processes follow the same pattern and in some cases include value-added features.
Visit our YouTube channel to view a series of webinars on the Life of a Dataset presented by ICPSR staff.
The Acquisition unit is charged with reaching out to Principal Investigators (PIs), funding agencies, and other data producers to ask them to deposit their digital data in the ICPSR archive. This unit provides the Data Deposit Form and supports its use. It also offers support on Data Management Plans. The Guide to Social Science Data Preparation Manual provides the research community with best practices for archiving their data.
The Archives receive raw data and prepare them for archiving and dissemination. Staff members use best practices to ensure that confidentiality and security are maintained and that the digital data are preserved for future generations of researchers.
The Processing Supervisor reviews the study materials as submitted and creates a processing plan. The data and documentation are reviewed and a disclosure risk review is performed. Throughout processing, the Supervisor or a delegate works with the Processor on maintaining respondents' confidentiality, reviews the Processor's work, and performs a final review of all data and documents prior to release of the study.
The Processor, working in ICPSR's Secure Data Environment (SDE), thoroughly reviews data for formatting and documentation; ensures that codebook-defined skip patterns are observed; standardizes missing data and other values; creates variable and value labels; and adds processing information to the study's metadata. Using the batch processing system Hermes, the Processor produces a suite of products including codebooks, and setup and system files for SAS, SPSS, Stata, and R. Finally, the Processor gathers the final data, metadata, and documentation, prepares the history file for preservation, and performs a final review before releasing the study using the Turnover batch processing system.
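As a rough illustration of two of the Processor's checks described above (standardizing missing data and confirming that a skip pattern is observed), here is a minimal Python sketch. It is not ICPSR's actual tooling: the variable names (EMPLOYED, HOURS), the raw missing codes, the uniform missing code -9, and the skip rule are all hypothetical and stand in for whatever a study's codebook defines.

```python
# Assumed uniform missing-data code; a real study's codebook would define this.
STANDARD_MISSING = -9

# Hypothetical codebook: for each variable, the raw codes that mean "missing".
CODEBOOK_MISSING = {
    "EMPLOYED": {"", "9"},
    "HOURS": {"", "999", "NA"},
}


def standardize_missing(record):
    """Recode each variable's raw missing codes to STANDARD_MISSING; cast the rest to int."""
    cleaned = {}
    for name, raw in record.items():
        value = raw.strip()
        if value in CODEBOOK_MISSING.get(name, set()):
            cleaned[name] = STANDARD_MISSING
        else:
            cleaned[name] = int(value)
    return cleaned


def violates_skip_pattern(record):
    """Hypothetical skip rule: respondents with EMPLOYED == 2 (not employed) skip HOURS."""
    return record["EMPLOYED"] == 2 and record["HOURS"] != STANDARD_MISSING


# Made-up raw records, as they might arrive from a deposit.
raw_records = [
    {"EMPLOYED": "1", "HOURS": "40"},
    {"EMPLOYED": "2", "HOURS": "NA"},
    {"EMPLOYED": "2", "HOURS": "35"},  # answered HOURS despite the skip rule
    {"EMPLOYED": "9", "HOURS": ""},
]

cleaned = [standardize_missing(r) for r in raw_records]
flagged = [i for i, r in enumerate(cleaned) if violates_skip_pattern(r)]
```

In this sketch, flagged records are only identified, not altered; deciding how to resolve a skip-pattern violation would be a judgment made against the study's documentation.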
The Collection Delivery unit's responsibilities include building the documentation for files, building and maintaining the ICPSR website, releasing all studies to the Web, and announcing their release.
Metadata Editors work with the Processors to build a document set and description for the dataset, ensuring that these files conform to ICPSR standards and are well organized.
Release Coordinators receive the study that has gone through Turnover, review the files, and approve the release of the study.
ICPSR is committed to ensuring that digital data available through the website will be preserved for future generations of researchers and has incorporated appropriate steps into the workflow to ensure that this commitment is met.
Computing and Network Services
The tools used to deposit, process, and deliver digital data are created and maintained by the Computing and Network Services (CNS) unit of ICPSR. This unit also supports digital preservation and data security.