Data Enhancement

The goal of data processing and enhancement is to make data usable to researchers who access them after they are deposited in a repository. In the social sciences, archives add value to data by making them easier to use for secondary analysis. Archival practices vary widely, often depending on the condition of the data to be archived and the goals of a particular repository or discipline.

At ICPSR, once data are submitted in a submission information package, they pass through a "pipeline" for processing and enhancement.

The specific steps depend on the unique characteristics of each dataset, but in general, ICPSR data processors always perform the following procedures:

They may also do the following:

  • Recode variables to address confidentiality concerns

  • Check for undocumented or out-of-range codes

  • Add question text to variables

  • Create variable labels

  • Create value labels

  • Identify and address foreign-language characters

  • Adjust format widths

  • Optimize file size

  • Standardize missing values

  • Check for consistency and skip patterns

  • Create an online analysis version with question text

  • Add variables to the Social Science Variables Database

  • Gather citations to related publications for the Bibliography of Data-Related Literature
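
As an illustration only, two of the optional steps listed above (checking for undocumented codes and standardizing missing values) can be sketched in Python. This is not ICPSR's actual tooling: the variable name EMPLOY, the codebook contents, the missing-value codes, and the function names are all hypothetical and chosen for the example.

```python
from typing import Dict, List, Set

# Hypothetical codebook: the documented codes for each variable.
CODEBOOK: Dict[str, Dict[int, str]] = {
    "EMPLOY": {1: "Employed", 2: "Unemployed", 9: "Refused"},
}

# Hypothetical source-specific missing codes, mapped to one standard code.
MISSING_CODES: Dict[str, Set[int]] = {"EMPLOY": {9, 99}}
STANDARD_MISSING = -9  # assumed convention for this sketch


def undocumented_codes(variable: str, values: List[int]) -> List[int]:
    """Return the sorted codes that appear in the data but not in the codebook."""
    documented = set(CODEBOOK[variable])
    return sorted({v for v in values if v not in documented})


def standardize_missing(variable: str, values: List[int]) -> List[int]:
    """Replace every known missing code for a variable with STANDARD_MISSING."""
    missing = MISSING_CODES.get(variable, set())
    return [STANDARD_MISSING if v in missing else v for v in values]


raw = [1, 2, 9, 7, 1, 99]
print(undocumented_codes("EMPLOY", raw))   # [7, 99]
print(standardize_missing("EMPLOY", raw))  # [1, 2, -9, 7, 1, -9]
```

In this sketch the check runs against the raw values, so a code such as 99 is reported as undocumented even though it is later treated as missing. A processor would likely resolve such codes with the data depositor before recoding them.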