Optimal Methods and Strategies for Reproducible Research: How to Publish Faster with Less Stress!

Data analysis is technically demanding, time-consuming, and plain hard work. How did you learn to organize and manage your research? Too often this is done haphazardly, perhaps in response to problems such as losing a critical file or finding an error in your research. The dataset must be prepared, statistical analyses performed, and results incorporated into papers. Invariably, reviewers want revisions that require additional passes through the data. Increasingly, journals expect authors to distribute the datasets and the script files that produced the paper’s results. Doing this efficiently, and without damaging errors, requires a workflow that is designed from the start to produce accurate, reproducible results. This workshop considers the entire research process and presents a workflow guided by the demands of producing reproducible and accurate results while working as quickly and efficiently as possible. Using this approach, your work goes faster, your findings are more trustworthy, and your results are reproducible. Topics include:

  • Planning, organizing, and documenting your work.
  • Methods to manage, document, and preserve digital files.
  • Strategies for computing that support reproducibility, document provenance, and facilitate revision.
  • Writing robust programs that use automation to increase accuracy and efficiency.
  • Techniques for preparing datasets that include consistent names and labels, metadata for documentation, and verification that variables are correct.
  • Strategies for sophisticated data analyses that are reproducible and efficient.
  • Methods for accurately and quickly incorporating results into a paper while maintaining the provenance of the findings.
  • Easy methods that simplify the revisions of papers.
  • Ways to prevent the catastrophic loss of files.

The course focuses on strategies and rules that work with Stata, R, SPSS, SAS, or any other statistical package. Examples in lecture use Stata, but the ideas apply readily to other languages. In lab, you can explore methods from lecture on your laptop, discuss how the ideas can be applied to your own work with the software you prefer, and meet with the instructor individually or in groups to discuss specific research applications or challenges. A handbook is provided to help you adapt materials from the workshop to your own research. The class is ideal for:

  • Experienced researchers who want to improve efficiency and develop a reproducible workflow
  • Scientists managing multiple projects
  • Researchers and data managers in academic or nonacademic research centers
  • Students starting their dissertation or preparing a paper for submission
  • Anyone hoping to become more efficient and to work with less stress!

Software: While Stata is used for the examples in lecture, you can use programs such as SAS or R during the workshop. A temporary license for Stata is provided. If you want to use other software, you should install it on your laptop before the workshop.

SPECIAL FEE: Participants who attend the second Four-Week Session of the 2018 ICPSR Summer Program (July 23, 2018 to August 17, 2018), or who have attended any Four-Week Session of the ICPSR Summer Program in the past, are eligible for a discounted fee of $900 to attend this five-day short workshop. To receive the discounted fee, please email the Summer Program at sumprog@icpsr.umich.edu.

Fee: Members = $1200; Non-members = $2200

Tags: workflow, reproducibility, publishing

Course Sections