Data Submission Requirements

This page summarizes ICPSR data submission requirements, including required data files, documentation, and related study materials. ICPSR accepts various formats and data types and can release data using different access methods (including restricted access). A completed data deposit form must accompany all submissions. Specific requirements for common data types are listed below. For a final pre-submission task list, use the Depositor Checklist.

Deposits should include the data files, documentation, and study-level information necessary for others to independently understand, evaluate, and reuse the data. Requirements may vary by data type, project, funder, archive, or access conditions. After reviewing these requirements, see Prepare Your Data for Deposit for the full preparation workflow.

At minimum, deposits should include:

  • Data files containing the research data to be archived or shared
  • Documentation explaining the data, methodology, variables, and file structure
  • Study-level information such as title, investigators, funding, methodology, sample, universe, unit of analysis, and data collection dates

Quantitative, Tabular Data

ICPSR prefers quantitative data in statistical package formats such as R, SAS (see note i), SPSS, or Stata. Delimited or ASCII files may also be accepted when accompanied by appropriate documentation, such as a data dictionary, codebook, or setup syntax.

  • Include all variable and value labels. They must be complete and approximate the meaning of the related questions and responses
  • Include codebook(s) (or similar documentation) describing all variables in the data
  • Fully describe research methods and practices (including the informed consent process, setting, and selection of participants or samples)
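For delimited files, a first draft of a data dictionary can be generated from the file itself and then completed by hand. The sketch below is one possible starting point, not an ICPSR tool: the function name, the output fields, and the sample data are all hypothetical, and the labels are deliberately left blank for the depositor to fill in.

```python
import csv
import io


def draft_data_dictionary(csv_text: str) -> list[dict]:
    """List each column with a blank label and a count of non-empty values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    counts = {name: 0 for name in columns}
    for row in reader:
        for name in columns:
            # Treat empty or whitespace-only cells as missing.
            if (row.get(name) or "").strip():
                counts[name] += 1
    return [
        {"variable": name, "label": "", "non_missing": counts[name]}
        for name in columns
    ]


# Hypothetical two-column file with one missing age.
sample = "id,age\n1,34\n2,\n"
for entry in draft_data_dictionary(sample):
    print(entry)
```

The generated rows only list what is in the file. Variable labels, value labels, and question text still have to come from the study team.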

A submission may include up to 5 data files and 1,000 variables, and must stay within 30 GB and 1,000 files overall. Please inquire about larger studies. After reviewing the required materials listed here, submitters may also consult the optional recommendations for guidance on making data and documentation more complete, better organized, and easier to review and reuse. For additional information, please see the Guide to Social Science Data Preparation and Archiving.
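The size limits above can be checked before starting a deposit. The following is a minimal sketch and not part of any ICPSR system: the class and function names are hypothetical, the variable limit is assumed to apply to the submission as a whole, and 1 GB is assumed to mean 10^9 bytes.

```python
from dataclasses import dataclass

# Limits quoted in the text; the constant names are illustrative only.
MAX_DATA_FILES = 5
MAX_VARIABLES = 1_000
MAX_TOTAL_FILES = 1_000
MAX_TOTAL_BYTES = 30 * 10**9  # assumption: 1 GB = 10^9 bytes


@dataclass
class DepositSummary:
    data_files: int
    variables: int   # assumption: counted across the whole submission
    total_files: int
    total_bytes: int


def check_limits(summary: DepositSummary) -> list[str]:
    """Return human-readable problems; an empty list means within limits."""
    problems = []
    if summary.data_files > MAX_DATA_FILES:
        problems.append(f"{summary.data_files} data files (limit {MAX_DATA_FILES})")
    if summary.variables > MAX_VARIABLES:
        problems.append(f"{summary.variables} variables (limit {MAX_VARIABLES})")
    if summary.total_files > MAX_TOTAL_FILES:
        problems.append(f"{summary.total_files} files (limit {MAX_TOTAL_FILES})")
    if summary.total_bytes > MAX_TOTAL_BYTES:
        problems.append(f"{summary.total_bytes} bytes (limit {MAX_TOTAL_BYTES})")
    return problems


# Hypothetical study that exceeds the data-file and variable limits.
study = DepositSummary(data_files=6, variables=1_200, total_files=40,
                       total_bytes=2 * 10**9)
for problem in check_limits(study):
    print("Over limit:", problem)
```

A non-empty result is a cue to contact ICPSR about a larger study rather than a reason to withhold the deposit.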

Qualitative, Text Data

Qualitative data may be submitted in formats such as plain text, rich text, Microsoft Word, or OCR-readable PDF files.

  • Fully describe research methods and practices (including the informed consent process, setting of interviews, and selection of interview subjects)
  • Include any interviewer instructions
  • Include any data collection instruments, such as interview protocols or question lists
  • Include a version of the data in which direct identifiers (e.g., name, address) are removed from the data and documentation and replaced clearly and consistently (e.g., John -> [person1] or [patient1] or [father])
  • Include a deidentified interview roster (to help ensure the files are complete and assist the user in navigating the text files)
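Consistent replacement of direct identifiers, as described in the list above, can be partly automated once the identifiers are known. The sketch below assumes a hand-built mapping from each name to its placeholder; the function name and the sample mapping are hypothetical. It only replaces identifiers that appear in the mapping with the same capitalization, so it does not substitute for reading the transcripts.

```python
import re


def replace_identifiers(text: str, mapping: dict[str, str]) -> str:
    """Replace each known direct identifier with its bracketed placeholder."""
    if not mapping:
        return text
    # Try longer names first so "Mary Jones" wins over a shorter overlap.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b"
    )
    # Every occurrence of a name maps to the same placeholder.
    return pattern.sub(lambda match: mapping[match.group(1)], text)


# Hypothetical mapping compiled from the interview roster.
mapping = {"John": "[person1]", "Mary Jones": "[person2]"}
print(replace_identifiers("John said Mary Jones called John.", mapping))
# prints: [person1] said [person2] called [person1].
```

Keeping the mapping in one place and applying it to every file is what makes the placeholders consistent across transcripts and documentation. The mapping itself contains the identifiers and should not be deposited.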

A submission may include up to 250 pages of text in total, in a readily accessible text format (e.g., .txt, .doc). For additional information, see the Guide for Sharing Qualitative Data (PDF).


Note i: 4-bit SAS data files only. Apply SAS formats, and submit the formatted data, the format library, and the PROC FORMAT code.

Documentation should be complete enough for future users to understand how the data were collected, structured, processed, and analyzed. Depending on the study, submit materials such as:

  • Codebooks or data dictionaries
  • User guides or README files
  • Questionnaires, survey instruments, or interview protocols
  • Interviewer instructions
  • Project summaries or final reports
  • Summary statistics, if available
  • Variable and value labels
  • Descriptions of computed or derived variables
  • Syntax or code needed to reproduce, merge, weight, or interpret the data
  • Deidentified interview rosters for qualitative data, if applicable
  • Related publications or citations
  • IRB approval documentation or sample informed consent language, when appropriate

Documentation is most useful when it integrates question text with variable information where possible. For qualitative text data, include a version of the data and documentation with direct identifiers removed and replaced clearly and consistently. Check the ICPSR Metadata Documentation Portal for guidance on what to include. Providing quality study-level metadata in the deposit form helps us connect users with your data.

If the deposit is associated with a grant, funder requirement, journal policy, or data management plan, include the information needed to document the project and support compliance, such as:

  • Funding source and grant number
  • Principal investigator names and affiliations
  • Project description, goals, main topics, and methodology
  • Data collection dates
  • Sample description, universe description, and unit of analysis
  • Related project or study website, if available

Review any applicable funder, grant, journal, or data management plan requirements before submitting data.

Before depositing, identify any timing considerations that may affect review, publication, or release, including:

  • Publication or promotion deadlines
  • Funder or grant-related data-sharing deadlines
  • Journal replication or data availability requirements
  • Planned release dates or embargo needs
  • Whether the deposit updates a previously distributed ICPSR study

If the deposit is an update to a previous ICPSR study, provide the ICPSR study number and describe the relationship between the new deposit and the earlier data.

Before submitting, review data and documentation for direct identifiers, indirect identifiers, sensitive questions, and contextual details that could increase disclosure risk. Remove direct identifiers before deposit whenever possible, including names, addresses, phone numbers, and other personally identifiable information (PII).
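A simple pattern scan can help flag some direct identifiers for manual review. The sketch below is illustrative only: the patterns assume US-style phone numbers and ordinary email addresses (email is an example of other PII and is not named in the text above), and the function name is hypothetical. It will miss names, addresses, and indirect identifiers entirely.

```python
import re

# Illustrative patterns; both are assumptions and far from exhaustive.
PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def find_possible_identifiers(lines):
    """Return (line_number, kind, matched_text) for each pattern hit."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((number, kind, match.group()))
    return hits


# Hypothetical transcript excerpt.
sample = [
    "Respondent lives nearby.",
    "Call 734-555-0100 or write jane@example.org.",
]
for hit in find_possible_identifiers(sample):
    print(hit)
```

An empty result does not mean a file is free of identifiers. It only means none of these particular patterns matched.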

Data sharing must be consistent with:

  • Participant consent
  • IRB requirements
  • Terms of use for existing data sources
  • Legal, contractual, or institutional restrictions

If the data contain identifiers or may require restricted access, identify this during the deposit process. ICPSR reviews deposited data for disclosure risk and may work with depositors to create public-use and/or restricted-use versions when appropriate. More details are available at Restricted-Use Data Management at ICPSR and Preserving Respondent Confidentiality.

Before depositing, confirm that you have the right to share the data and related materials. Review:

  • Informed consent documents
  • IRB documentation
  • Terms of use for data obtained from existing sources
  • Funder, grant, journal, or institutional requirements
  • Restrictions on redistribution or public access
  • Whether public or restricted access is appropriate

If restricted access is needed, plan to discuss dissemination with ICPSR and collaborate on any required terms of use or access agreements.