ICPSR accepts various formats and data types and can release data using different access methods (including restricted access). A completed data deposit form must accompany all submissions. Specific requirements for common data types are listed below.

Quantitative, tabular data requirements

The submission may include up to 5 data files and 1,000 variables, and must stay within 30 GB and 1,000 files overall. Please inquire about larger studies. For additional information, please see the Guide to Social Science Data Preparation and Archiving.

  • Submit all tabular data files in SPSS, SAS [i], or Stata format
  • Include all variable and value labels. They must be complete and approximate the meaning of the related questions and responses
  • Include codebook(s) (or similar documentation) describing all variables in the data
  • Fully describe research methods and practices (including the informed consent process, setting, and selection of participants or samples)

Optional Recommendations for Quantitative, Tabular Data

Supply Additional Documentation/Files:

  • Include original survey/data collection instrument(s)
  • Include IRB approval and a sample informed consent statement
  • Include raw and derived variables (and coding used to produce derived variables), ensuring variables related to published results are included
  • Include design variables (stratum, cluster, final weights) and linking variables (where files can be combined)
  • Include all citations for publications related to data submission

Variables and Variable Labeling

  • Keep each variable name shorter than 32 characters
  • Use a unique variable label for each variable
  • Approximate question text in the label
  • Do not use periods (.) or dollar signs ($) within labels
  • Do not start a label with a number
  • Do not use spaces within labels (use - or _ instead)
  • Variable label length should not exceed 256 characters, when possible
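
The naming and labeling rules above can be checked before deposit. The following is a minimal Python sketch, not an ICPSR tool: the function names `check_variable` and `find_duplicate_labels` are hypothetical, and it assumes the variable names and labels have already been exported from the statistical package into plain strings.

```python
MAX_NAME_LEN = 31    # "shorter than 32 characters"
MAX_LABEL_LEN = 256  # recommended upper bound for a variable label


def check_variable(name, label):
    """Return a list of guideline problems for one variable name/label pair."""
    problems = []
    if len(name) > MAX_NAME_LEN:
        problems.append("name is 32 characters or longer")
    if "." in label or "$" in label:
        problems.append("label contains a period or dollar sign")
    if label[:1].isdigit():
        problems.append("label starts with a number")
    if " " in label:
        problems.append("label contains a space")
    if len(label) > MAX_LABEL_LEN:
        problems.append("label exceeds 256 characters")
    return problems


def find_duplicate_labels(labels):
    """Given a dict of variable name -> label, return labels used more than once."""
    seen, dupes = set(), set()
    for label in labels.values():
        if label in seen:
            dupes.add(label)
        seen.add(label)
    return sorted(dupes)
```

For example, `check_variable("Q1", "1 Cost in $")` reports three problems (dollar sign, leading number, space), while `check_variable("AGE", "Age_of_respondent")` reports none.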

Values and Value Labeling

  • Numeric codes should not exceed 10 digits
  • Use a unique value label for each discrete category
  • Omit value labels for non-integer values
  • Omit value labels for date and time variables
  • Omit value labels for string variables, if possible
  • Value label length should not exceed 120 characters, when possible
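
As a rough illustration of the value-labeling rules above, the sketch below checks the value labels of a single variable. It is only a sketch under stated assumptions: `check_value_labels` is a hypothetical helper, the labels are assumed to be available as a dict mapping numeric codes to strings, and the sign of a code is assumed not to count toward the 10-digit limit.

```python
def check_value_labels(value_labels):
    """Check one variable's value labels (dict of numeric code -> label string)."""
    problems = []
    for code, label in value_labels.items():
        if isinstance(code, float) and not code.is_integer():
            problems.append(f"{code}: non-integer value should not be labeled")
        elif len(str(abs(int(code)))) > 10:
            # Assumption: only the digits are counted, not a leading minus sign.
            problems.append(f"{code}: code has more than 10 digits")
        if len(label) > 120:
            problems.append(f"{code}: label exceeds 120 characters")
    labels = list(value_labels.values())
    if len(set(labels)) != len(labels):
        problems.append("value labels are not unique")
    return problems
```

A call such as `check_value_labels({1: "Yes", 2: "No"})` returns an empty list, while labeling `1.5` or reusing "Yes" for two codes is flagged.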

Missing Data

  • Create consistent missing data codes/values that are used across all variables
  • Recode any alpha-numeric missing data codes to numeric codes (applies to SAS and Stata data files)
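
One way to apply a consistent numeric scheme is a single lookup table shared by every variable. The sketch below is illustrative only: the codes in `MISSING_CODES` (for example "DK" mapped to -8) are hypothetical and are not values prescribed by ICPSR, and `recode_missing` is a made-up helper name.

```python
# Hypothetical mapping: these alphanumeric codes and numeric replacements
# are examples only. Use whatever scheme fits the study, applied consistently.
MISSING_CODES = {"DK": -8, "REF": -9, "NA": -7}


def recode_missing(values, mapping=MISSING_CODES):
    """Replace alphanumeric missing-data codes with numeric codes."""
    recoded = []
    for value in values:
        # Normalize strings so " dk " and "DK" are treated the same way.
        key = value.strip().upper() if isinstance(value, str) else value
        recoded.append(mapping.get(key, value))
    return recoded
```

Applying the same mapping to every variable, as in `recode_missing([1, "dk", 3, "REF "])`, keeps the missing-data codes consistent across the file.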

Column Widths

  • For all numeric variables, 15 characters or fewer
  • For string variables, 250 characters or fewer**

**Some statistical packages allow longer string variables, but when the files are converted to other packages, those values are truncated.
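
The column-width limits above could be screened with a short script before deposit. This is a minimal sketch, assuming the data have been read into a list of dicts (one per record) and that the width of a numeric value is the length of its printed form, including any sign or decimal point. `check_column_widths` is a hypothetical name.

```python
NUMERIC_LIMIT = 15  # characters, for numeric variables
STRING_LIMIT = 250  # characters, for string variables


def check_column_widths(rows):
    """Return {variable name: widest offending width} for values over the limits."""
    too_wide = {}
    for row in rows:
        for name, value in row.items():
            if value is None:
                continue  # missing cells have no width to check
            is_string = isinstance(value, str)
            limit = STRING_LIMIT if is_string else NUMERIC_LIMIT
            # Assumption: numeric width is the length of the printed value.
            width = len(value) if is_string else len(str(value))
            if width > limit:
                too_wide[name] = max(width, too_wide.get(name, 0))
    return too_wide
```

An empty result means no value exceeded the limits; otherwise the dict names each variable that would risk truncation on conversion.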

Qualitative, text data requirements

The submission may include up to 250 pages (total) of text files in readily accessible text format (e.g., .txt, .doc). For additional information, see the Guide for Sharing Qualitative Data (pdf).

  • Fully describe research methods and practices (including the informed consent process, setting of interviews, and selection of interview subjects)
  • Include any interviewer instructions
  • Include any data collection instruments, such as interview protocols and questions
  • Include a version of the data in which direct identifiers (e.g., name, address) are removed from the data and documentation and replaced clearly and consistently (e.g., John -> [person1] or [patient1] or [father])
  • Include a deidentified interview roster (to help ensure the files are complete and assist the user in navigating the text files)
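
Consistent replacement of direct identifiers can be partly automated once the identifiers are known. The sketch below is an illustration, not a complete de-identification procedure: `replace_identifiers` is a hypothetical helper, the lookup table must be built by the depositor, and a manual review of the text is still assumed.

```python
import re


def replace_identifiers(text, replacements):
    """Replace each known identifier with its bracketed placeholder."""
    # Handle longer identifiers first so "John Smith" is replaced before "John".
    for name in sorted(replacements, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        placeholder = replacements[name]
        # A function replacement avoids backslash interpretation in the placeholder.
        text = re.sub(pattern, lambda match, p=placeholder: p, text)
    return text


# Example lookup table (hypothetical names and placeholders).
roster = {"John Smith": "[person1]", "John": "[person1]", "Mary": "[person2]"}
print(replace_identifiers("John met Mary. John Smith left.", roster))
```

Using one lookup table for all transcripts means the same person always receives the same placeholder, which is what makes the replacement consistent across files.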

[i] 64-bit SAS data files only. Apply SAS formats. Submit formatted data, the format library, and the PROC FORMAT code.