Budgeting for ICPSR Professional Curation Services
Last updated 5/9/2024
Steps to Request a Cost Estimate for ICPSR Curation Services
- Review data submission requirements (below).
- Register your proposed data submission and request a cost estimate: Complete this form to request a cost estimate and notify ICPSR of your potential data deposit. ICPSR will supply a cost estimate covering a one-time, final submission for a set of related data and documentation files.
- (optional) Designate ICPSR as the intended repository in your Data Management and Sharing Plan (DMS Plan). If you are writing a grant application, you may designate ICPSR in your DMS Plan if the data fall within the ICPSR Collection Development Policy, and you may include ICPSR curation services in your grant application budget. NIH grantees may also consider designating one of the NIH-funded archives hosted by ICPSR: Data Sharing for Demographic Research, National Computerized Data on Aging, and National Addiction and HIV Data Archive Program.
- Use the ICPSR NIH DMS Plan template (.docx) to create your DMS Plan -OR- use the copy-and-paste language provided (highlighted in yellow in the document).
- (optional) Include the estimate received from Step #2 (above) in the proposed budget. Please allow 10 business days after submitting the form for a staff member to contact you with the estimated cost.
Data Submission Requirements
ICPSR accepts various formats and data types and can release data using different access methods (including restricted access). A completed data deposit form must accompany all submissions. Specific requirements for common data types are listed below.
Quantitative, tabular data requirements
The submission may include up to 5 data files and 1,000 variables, and must not exceed 30 GB or 1,000 files overall. Please inquire about larger studies. For additional information, please see the Guide to Social Science Data Preparation and Archiving.
- Submit all tabular data files in SPSS, SAS, or Stata format
- Include all variable and value labels. They must be complete and approximate the meaning of the related questions and responses
- Include codebook(s) (or similar documentation) describing all variables in the data
- Fully describe research methods and practices (including the informed consent process, setting, and selection of participants or samples)
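As a rough pre-submission check, the size limits quoted above can be expressed as a few comparisons. The sketch below is illustrative only and is not an ICPSR tool: the `SubmissionSummary` class and `limit_problems` function are hypothetical names, the 1,000-variable limit is assumed to apply to the submission as a whole, and 1 GB is assumed to mean 10^9 bytes.

```python
from dataclasses import dataclass

# Limits quoted from the requirements above.
MAX_DATA_FILES = 5
MAX_VARIABLES = 1_000          # assumption: counted across the whole submission
MAX_TOTAL_BYTES = 30 * 10**9   # assumption: 1 GB = 10^9 bytes
MAX_TOTAL_FILES = 1_000


@dataclass
class SubmissionSummary:
    """Hypothetical summary of a planned deposit."""
    data_files: int
    variables: int
    total_bytes: int
    total_files: int


def limit_problems(summary: SubmissionSummary) -> list[str]:
    """Return one message per limit that the planned deposit exceeds."""
    checks = [
        (summary.data_files, MAX_DATA_FILES, "data files"),
        (summary.variables, MAX_VARIABLES, "variables"),
        (summary.total_bytes, MAX_TOTAL_BYTES, "bytes in total"),
        (summary.total_files, MAX_TOTAL_FILES, "files in total"),
    ]
    return [
        f"{actual:,} {what} exceeds the limit of {limit:,}"
        for actual, limit, what in checks
        if actual > limit
    ]


if __name__ == "__main__":
    planned = SubmissionSummary(data_files=6, variables=1200,
                                total_bytes=2 * 10**9, total_files=40)
    for message in limit_problems(planned):
        print(message)
```

A non-empty result suggests the study is larger than a standard submission, in which case the guidance above is to inquire with ICPSR.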
Optional Recommendations for Quantitative, Tabular Data:
Supply Additional Documentation/Files:
- Include original survey/data collection instrument(s)
- Include IRB approval and a sample informed consent statement
- Include raw and derived variables (and coding used to produce derived variables), ensuring variables related to published results are included
- Include design variables (stratum, cluster, final weights), linking variables (where files can be combined)
- Include all citations for publications related to data submission
Variables and Variable Labeling:
- Each variable name is less than 32 characters
- Use a unique variable label for each variable
- Approximate question text in the label
- Do not use periods (.) or dollar signs ($) within labels
- Do not start a label with a number
- Do not include spaces within labels (use - or _)
- Variable label length should not exceed 256 characters, when possible
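The labeling recommendations above are mechanical enough to check before deposit. The following is a minimal sketch that applies the rules literally as listed; `label_problems` is a hypothetical helper, and it assumes the variable names and labels have already been exported from the data file into a plain dictionary.

```python
def label_problems(labels: dict[str, str]) -> list[str]:
    """Check a {variable name: variable label} mapping against the list above."""
    problems = []
    first_use = {}  # label -> first variable name that used it
    for name, label in labels.items():
        if len(name) >= 32:
            problems.append(f"{name}: name must be shorter than 32 characters")
        if label in first_use:
            problems.append(f"{name}: label duplicates the label of {first_use[label]}")
        else:
            first_use[label] = name
        if "." in label or "$" in label:
            problems.append(f"{name}: label contains a period or dollar sign")
        if label[:1].isdigit():
            problems.append(f"{name}: label starts with a number")
        if " " in label:
            problems.append(f"{name}: label contains spaces (use - or _)")
        if len(label) > 256:
            problems.append(f"{name}: label is longer than 256 characters")
    return problems


if __name__ == "__main__":
    # Hypothetical example variables.
    example = {
        "AGE": "Age_of_respondent_in_years",
        "INCOME": "2019 household income in $",
    }
    for message in label_problems(example):
        print(message)
```

Whether a label approximates the question text still needs a human reader; the sketch only covers the rules that can be tested by inspection of the strings.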
Values and Value Labeling:
- Numeric codes should not be greater than 10 digits
- Use a unique value label for each discrete category
- Omit value labels when they have non-integer values
- Omit value labels for date and time variables
- Omit value labels for string variables, if possible
- Value label length should not exceed 120 characters, when possible
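A similar check can be sketched for the value-label recommendations of a single variable. This is an illustration under stated assumptions, not an official validator: `value_label_problems` is a hypothetical name, the input is assumed to be a dictionary of numeric codes to labels for a numeric (not date, time, or string) variable, and the 10-digit rule is assumed to count digits of the integer code, ignoring any sign.

```python
def value_label_problems(value_labels: dict[float, str]) -> list[str]:
    """Check one variable's {numeric code: value label} mapping."""
    problems = []
    seen_labels = set()
    for code, label in value_labels.items():
        if code != int(code):
            problems.append(f"{code}: non-integer values should not carry a label")
        elif len(str(abs(int(code)))) > 10:
            problems.append(f"{code}: numeric code is longer than 10 digits")
        if label in seen_labels:
            problems.append(f"{code}: label {label!r} is already used by another category")
        seen_labels.add(label)
        if len(label) > 120:
            problems.append(f"{code}: label is longer than 120 characters")
    return problems


if __name__ == "__main__":
    # Hypothetical labels for a yes/no item with a stray fractional code.
    for message in value_label_problems({1: "Yes", 2: "Yes", 1.5: "Maybe"}):
        print(message)
```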
Missing Data:
- Create consistent missing data codes/values that are used across all variables
- Recode any alpha-numeric missing data codes to numeric codes (applies to SAS and Stata data files)
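Recoding alpha-numeric missing codes amounts to applying one shared lookup table to every variable. The sketch below shows the idea on a plain list of values; the specific codes (`NA`, `DK`, `REF`) and their numeric replacements are hypothetical examples rather than ICPSR conventions, and in practice the recode would usually be done inside SAS or Stata.

```python
# Hypothetical mapping; choose numeric codes that cannot occur as real values.
MISSING_CODES = {"NA": -9, "DK": -8, "REF": -7}


def recode_missing(values, codes=MISSING_CODES):
    """Replace alpha-numeric missing codes with numeric ones, leaving other values as-is."""
    recoded = []
    for value in values:
        key = value.strip().upper() if isinstance(value, str) else None
        recoded.append(codes[key] if key in codes else value)
    return recoded


if __name__ == "__main__":
    print(recode_missing([34, "NA", 51, "dk", 28]))
```

Using the same mapping for every variable is what keeps the missing data codes consistent across the file.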
Column Widths:
- For all numeric variables, 15 characters or fewer
- For string variables, 244 characters or fewer
Qualitative, text data requirements
The submission may include up to 250 pages (total) of text files in readily accessible text format (e.g., .txt, .doc). For additional information, see the Guide for Sharing Qualitative Data (pdf).
- Fully describe research methods and practices (including the informed consent process, setting of interviews, and selection of interview subjects)
- Include any interviewer instructions
- Include any data collection instruments such as interview protocols, questions
- Include a version of the data where direct identifiers are removed from the data and documentation (e.g., name, address, etc.) and replaced clearly and consistently (e.g., John -> [person1] or [patient1] or [father])
- Include a deidentified interview roster (to help ensure the files are complete and assist the user in navigating the text files)
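Replacing direct identifiers "clearly and consistently" means the same person always maps to the same placeholder across every transcript. One way to sketch this is with a single hand-maintained mapping applied to all files. The function name `replace_identifiers` and the example names are hypothetical, and the code only substitutes identifiers it is told about; it does not detect them automatically, so a manual review is still needed.

```python
import re


def replace_identifiers(text: str, replacements: dict[str, str]) -> str:
    """Replace each known identifier with its placeholder, matching whole words only."""
    if not replacements:
        return text
    # Longest identifiers first so "John Smith" is matched before "John".
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda match: replacements[match.group(1)], text)


if __name__ == "__main__":
    # Hypothetical mapping, reused for every file in the deposit.
    mapping = {"John": "[person1]", "Mary": "[person2]"}
    print(replace_identifiers("John met Mary. Later John left.", mapping))
```

Keeping the mapping itself out of the deposit, and using it to build the deidentified roster, helps keep placeholders aligned between the roster and the text files.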
Payment for Curation Services
ICPSR will review all deposit submissions to ensure the data are complete and meet the requirements of a standard submission, and will provide an invoice for curation services before we begin our work.
Please be aware that ICPSR’s costs may change over time. Following our evaluation of the data submission, ICPSR may revise the cost estimate. If the cost of ICPSR curation services exceeds the allocated budget, we will discuss alternative data archiving options.