Depositing in the WEAI Data and Code Repository
Replication packages must include the following elements:
- A summary file (preferably in simple TXT format or PDF) describing the contents of the replication package. It should explain the role and function of each file included and list all software needed to run the code, including any add-on packages. It should also give simple instructions for running the code to generate the results and state where the results can be found once the code has finished. This file must follow the format specified in the README Template provided by the Social Science Data Editors.
- All data files and code necessary to produce the main tables and figures in the manuscript.
- Ideally authors should provide the code used to clean and organize their data from original data files. When that is not feasible, authors should provide a clear description of their process for doing so including any decision criteria for dropping or excluding data, imputing values or other data transformations performed between the original data files and the final data file. When base data files are not included, authors should explain the origin of those base data files and explain how another researcher could access them.
- For any projects that involve the creation of an original data set via surveys, experiments or similar methods, authors should provide full details on the methods used in that process. This includes the data-gathering instruments, experiment programs, instruction scripts and similar materials, along with a brief description of how these materials were used to gather data.
- For any projects that involve simulations or computational elements, the code generating those calculations should be included, along with an explanation of how to run it.
- It is expected that some data sets may be proprietary or unable to be publicly archived. If that’s the case, this should be explained when submitting. In lieu of providing the data, authors should clearly describe how other researchers can obtain the data and how the data were then processed into the form used. Code for conducting the work should still be included and explained even if the data themselves cannot be uploaded.
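As one way to begin the summary file's description of each file, the package contents can be listed programmatically and then annotated by hand. The following is a minimal Python sketch, not a WEAI or ICPSR tool; the function names `list_package_files` and `file_inventory` and the "TODO" placeholder text are assumptions made for illustration.

```python
from pathlib import Path


def list_package_files(root):
    """Return the relative paths of all files under root, sorted, with forward slashes."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def file_inventory(root):
    """Build a plain-text inventory with one line per file.

    The "TODO" placeholder is hypothetical: authors replace it with the
    role and function of each file before pasting the list into the README.
    """
    return "\n".join(
        f"{path}: TODO describe role" for path in list_package_files(root)
    )
```

Calling `print(file_inventory("path/to/package"))` on the package folder gives a starting list; the descriptions themselves still have to be written by the authors.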
General instructions for data labeling:
- Each variable in the data collection should have a set of exhaustive, mutually exclusive codes.
- Variable labels and value labels should clearly describe the information or question recorded in that variable.
- Missing data codes should be defined.
- Identifying information should be removed from the data to protect confidentiality.
- Program code and command files should be annotated to facilitate replication and ensure clear correspondence between code and figures, tables, and analyses in the article.
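The labeling points above (exhaustive, mutually exclusive codes and defined missing data codes) can be checked mechanically for each variable. The sketch below is one possible approach in Python; the function `check_codebook`, the example variable, and its codes are all hypothetical and are not part of any WEAI or ICPSR requirement.

```python
def check_codebook(values, value_labels, missing_codes):
    """Return the set of observed values that have no defined code.

    values: iterable of observed values for one variable
    value_labels: dict mapping each valid code to its label
    missing_codes: dict mapping each missing-data code to its meaning
    """
    # A code must not be both a valid value and a missing-data code.
    overlap = set(value_labels) & set(missing_codes)
    if overlap:
        raise ValueError(f"codes defined as both valid and missing: {sorted(overlap)}")
    # Exhaustiveness: every observed value must appear in one of the two maps.
    defined = set(value_labels) | set(missing_codes)
    return set(values) - defined


# Hypothetical variable: employment status with two missing-data codes.
employment_labels = {1: "Employed", 2: "Unemployed", 3: "Not in labor force"}
employment_missing = {-8: "Refused", -9: "Not asked"}

observed = [1, 2, 3, -9, 1, 4]
print(check_codebook(observed, employment_labels, employment_missing))  # prints {4}
```

An empty set means every observed value is documented; here the value 4 would need either a label or a correction in the data.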
When you are ready to upload your replication package:
- You can verify in advance whether your package will satisfy our requirements. You can access a copy of the checklist our Data Team will go through to verify that a replication archive is complete. Prior to uploading your data archive, please review this checklist and make certain that someone reviewing your archive would be able to check “Yes” to the relevant questions. If there are any questions where “No” would be selected, please make certain that your README file explains the reasons for that omission.
- Once you have pre-verified that your package will satisfy our requirements, select “Start your deposit” and sign in to the ICPSR website.
- You will be directed to your ICPSR Workspace, where you will see a list of previously created projects.
- Click “Create New Study” and then “Proceed.”
- Provide a descriptive project title. The title should be “[ECIN or COEP] Replication Package for [Title of Article]”.
- Fill out study-level metadata as appropriate including the following:
- The list of principal investigators (authors). Please ensure that all authors have affiliations (if not affiliated: “Independent Researcher”).
- A descriptive summary of your project. This can be a copy of the abstract from your article, a note that this is data and/or code accompanying the article, and/or text that clearly allows people to understand the purpose of these materials independently.
- Select ICPSR subject terms (e.g., “Machine Learning” or “Randomized Control Trial”).
- Select JEL classification(s) (should be the same as article).
- Select the time periods to which the data refer.
- Select the geographic coverage areas to which the data refer.
- Fill in the manuscript number (your ScholarOne tracking number as assigned by the editorial office); this will allow us to properly connect the repository with the manuscript.
- Other metadata fields to complete include universe and data type as relevant to your project. When only code is produced, authors should choose data type = program source code. The Methods sections are particularly relevant for survey or experimental data: response rates, sampling rates, etc.
- Once the “Describe Data” page has been completed, scroll down and click on “Save & Continue” to proceed.
- Upload data, computer programs, sets of computer program codes, extracts of existing data files, and supporting documentation necessary to replicate the results of your analyses without any additional information from the author(s).
- Click on “Upload” to add files, folders, or Zip archives. WEAI requires that you import zip files instead of uploading as-is.
- Please upload the README (in PDF or TXT format) as the very first file so that it can be found easily by those browsing the archive.
- If the uncompressed contents of the deposit (the unzipped size of the zip file) are larger than 30GB, or if you have more than 10,000 files in your deposit, please email [email protected] with your project number to identify an upload solution.
- After clicking the appropriate upload option, drag and drop your files or choose files through a file selection window. Don’t leave the page while files are being uploaded.
- You can “Download All” files and delete individual files under the “Actions” column.
- You should see a list of files and directories in your ICPSR project workspace. The ideal structure includes:
- No redundant directories: the first items you should see are the README and any subdirectories.
- There should be no zip files!
- The structure should be as you last ran the code. What you see in the deposit interface is what others will see once it is published.
- Once the “Deposit Data” page has been completed, click on “Save & Continue.”
- On the “Deposit Settings” page, fill out the following:
- For “ICPSR Archive Collection” click your cursor in the text box and select “WEAI Repository.”
- Select “No” for the “Study Curation Request” question.
- Select “No” to the disclosure risk questions.
- Fill in the manuscript number.
- Enter the name of the WEAI journal to which you submitted.
- Once you have completed all of the above, click on “Save & Continue.”
- On the Deposit Agreement page, click on the radio button for deposit approval.
- Click on “View & Sign Deposit Agreement” and then type in your name to electronically agree to the deposit terms.
- Next, select the appropriate response to the ADA Accessibility question.
- Lastly, select a license to apply to the data collection.
- Once you “Submit for Review,” journal staff will be in touch after reviewing your submission. Note that revisions may be requested. Before uploading your final paper for production, add the data citation provided by ICPSR to the references section, along with a Data Availability Statement conforming to one of these standard templates.
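Several structural points in the upload steps above (a README at the top level, no remaining zip files, and the 30GB and 10,000-file thresholds) can be checked locally on the unzipped package before depositing. The following Python sketch is an illustration only and does not replace the Data Team's checklist; the function name `check_deposit`, the assumption that the README file name begins with "README", and the reading of "GB" as 1024**3 bytes are all assumptions.

```python
from pathlib import Path

MAX_BYTES = 30 * 1024**3  # 30GB threshold; reading "GB" as 1024**3 bytes is an assumption
MAX_FILES = 10_000        # file-count threshold from the instructions


def check_deposit(root, max_bytes=MAX_BYTES, max_files=MAX_FILES):
    """Return a list of problems found in the unzipped package at root (empty if none)."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    problems = []

    # The README (TXT or PDF) should be visible at the top level.
    # Assumes the file name starts with "README" (case-insensitive).
    top_names = [p.name.lower() for p in root.iterdir() if p.is_file()]
    if not any(n.startswith("readme") and n.endswith((".txt", ".pdf")) for n in top_names):
        problems.append("no README (.txt or .pdf) at the top level")

    # No zip files should remain in the deposited structure.
    zips = sorted(p.relative_to(root).as_posix() for p in files if p.suffix.lower() == ".zip")
    if zips:
        problems.append(f"zip files present: {', '.join(zips)}")

    # Above either threshold, the instructions ask authors to request an upload solution.
    if len(files) > max_files:
        problems.append(f"{len(files)} files exceeds the limit of {max_files}")
    total = sum(p.stat().st_size for p in files)
    if total > max_bytes:
        problems.append(f"{total} bytes exceeds the limit of {max_bytes}")

    return problems
```

Running `check_deposit("path/to/unzipped_package")` and getting an empty list suggests the folder meets these particular points; the remaining checklist items still need to be reviewed by hand.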