MDRC's The Higher Education Randomized Controlled Trials Restricted Access File (THE-RCT RAF), United States, 2003-2024 (ICPSR 37932)

Version Date: Jan 14, 2025

Principal Investigator(s):
John Diamond, MDRC; Michael J. Weiss, MDRC; Colin Hill, MDRC; Austin Slaughter, MDRC; Stanley Dai, MDRC


Version V5

  • V5 [2025-01-14]
  • V4 [2023-01-31] unpublished
  • V3 [2023-01-04] unpublished
  • V2 [2021-06-07] unpublished
  • V1 [2021-03-10] unpublished

The Higher Education Randomized Controlled Trial (THE-RCT) study aims to capitalize on existing data from postsecondary education RCTs to foster substantive and methodological scholarship and encourage teaching and learning opportunities. The cornerstone of THE-RCT is a restricted access file (RAF). The initial version contains individual-participant data from more than 25 of MDRC's higher education RCTs covering 50 institutions and over 50,000 students. The data were originally collected as part of different randomized controlled trial evaluations of a variety of higher education interventions. The data were collected for different student samples, at different times, and in different locations for each study.

The data were collected from four data sources:

  1. Baseline: Baseline student demographic data (e.g., gender, race/ethnicity, age) were gathered either via a survey administered to students upon joining the study (but prior to random assignment) or from study colleges' administrative records.
  2. College Transcript: Student transcript data (e.g., enrollment, credits attempted, credits earned, GPA) were provided by the study colleges or state higher education agencies.
  3. College Credential Attainment: Student credential attainment data were provided by the study colleges or state higher education agencies.
  4. National Student Clearinghouse: Student enrollment and credential attainment data were provided by the National Student Clearinghouse via its StudentTracker database. These data include enrollment and credential attainment at colleges beyond those where the study took place.

The RAF contains student-level data, including baseline demographics (e.g., gender, race/ethnicity), outcomes (e.g., enrollment, credits earned, credentials), an indicator of experimental group (e.g., program or control group), and study variables (e.g., a variable that allows users to link to the RCT-level database).
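As an illustration of how the linking variable might be used, the following is a minimal sketch in Python. It assumes a hypothetical key named `rct_id` shared by the student-level file and the RCT-level database; the actual variable names are given in the PI's Codebook, and every field name and value below is made up for the example.

```python
# Minimal sketch: attach RCT-level attributes to student-level records.
# All field names ("rct_id", "rct_label", ...) are hypothetical placeholders,
# not the names used in the RAF. All values are invented toy data.

def attach_rct_info(students, rct_level, key="rct_id"):
    """Return new student records with the matching RCT-level fields added.

    students  : list of dicts, one per student
    rct_level : dict mapping the linking key to a dict of RCT-level fields
    """
    merged = []
    for student in students:
        info = rct_level.get(student[key])
        if info is None:
            # Fail loudly rather than silently dropping unmatched students.
            raise KeyError(f"no RCT-level record for {student[key]!r}")
        # Student fields win if a name appears in both records.
        merged.append({**info, **student})
    return merged


students = [
    {"student_id": 1, "rct_id": "A", "program_group": 1, "credits_earned": 12.0},
    {"student_id": 2, "rct_id": "B", "program_group": 0, "credits_earned": 9.0},
]
rct_level = {
    "A": {"rct_label": "Toy study A"},
    "B": {"rct_label": "Toy study B"},
}

merged = attach_rct_info(students, rct_level)
```

Raising on an unmatched key is a design choice for the sketch: it surfaces linkage problems early instead of producing a silently smaller analytic sample.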

Diamond, John, Weiss, Michael J., Hill, Colin, Slaughter, Austin, and Dai, Stanley. MDRC’s The Higher Education Randomized Controlled Trials Restricted Access File (THE-RCT RAF), United States, 2003-2024. Inter-university Consortium for Political and Social Research [distributor], 2025-01-14.

Arnold Ventures, United States Department of Education. Institute of Education Sciences (R305A190161)

University Campus

This data collection may not be used for any purpose other than statistical reporting and analysis. Use of these data to learn the identity of any person or establishment is prohibited. To protect respondent privacy, some of the data files in this collection are restricted from general dissemination. To obtain these restricted files researchers must agree to the terms and conditions of a Restricted Data Use Agreement.

Inter-university Consortium for Political and Social Research

2003 -- 2024
2003 -- 2024
  1. For additional information on the Higher Education Randomized Controlled Trial study, please visit the MDRC website.

The purpose of this collection is to capitalize on existing data from postsecondary education RCTs to foster substantive and methodological scholarship and encourage teaching and learning opportunities.

Sampling varies by RCT; see the PI's documentation RCT-Level Database for more information.


US college students

School, Student

Outcome data were collected for nearly all students in each RCT. Typically, fewer than 1 percent of students who were randomly assigned were "dropped" from an RCT due to, for example, a student withdrawing their consent to collect outcome data. See PI's documentation RCT-Level Database for the number of students randomly assigned in each RCT and the number of students in the analytic sample, and see PI's Codebook for details on missing data.



2025-01-14 The dataset has been updated to include data for the date range of 2003-2024. The previous dataset covered the date range of 2003-2019. The accompanying documentation has also been updated to reflect these changes.

2023-01-31 Documentation has been updated to correct minor errors.

2023-01-04 Missing values and a new set of participation variables have been added to the datasets. All documentation has been updated to reflect the data changes.

2021-06-07 The title and PIs for this study have been updated, and ICPSR documentation has been updated to reflect that.


The dataset includes a variable that indicates each student's probability of being assigned to the program group (special care needs to be taken in multi-arm trials). This variable can be used to create weights to account for differences in the probability of assignment to the program both within and between studies.
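One common way to build such weights is inverse probability of assignment weighting. The sketch below is not taken from the PI's documentation: it assumes a simple two-arm design, so multi-arm trials need the extra care noted above, and it uses hypothetical inputs (a program-group flag, the assignment probability, and an outcome) with invented toy values.

```python
# Minimal sketch of inverse-probability-of-assignment weights for a
# two-arm design. Inputs and values are hypothetical; the RAF's actual
# variable names are documented in the PI's Codebook.

def assignment_weight(in_program, p_program):
    """Weight = 1/p for program-group students, 1/(1 - p) for control."""
    if not 0.0 < p_program < 1.0:
        raise ValueError("p_program must be strictly between 0 and 1")
    return 1.0 / p_program if in_program else 1.0 / (1.0 - p_program)


def weighted_mean_difference(rows):
    """Weighted program-minus-control difference in mean outcomes.

    rows: iterable of (in_program, p_program, outcome) tuples.
    """
    sums = {True: 0.0, False: 0.0}
    totals = {True: 0.0, False: 0.0}
    for in_program, p_program, outcome in rows:
        w = assignment_weight(in_program, p_program)
        sums[in_program] += w * outcome
        totals[in_program] += w
    return sums[True] / totals[True] - sums[False] / totals[False]


# Toy data: one block assigned 50/50, another assigned 75/25.
rows = [
    (True, 0.5, 12.0),
    (False, 0.5, 9.0),
    (True, 0.75, 10.0),
    (True, 0.75, 14.0),
    (True, 0.75, 12.0),
    (False, 0.75, 8.0),
]
difference = weighted_mean_difference(rows)
```

With these weights, students from blocks where their own group was less likely to be assigned count for more, so blocks with different assignment ratios contribute to both group means in the same proportions.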



  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

  • One or more files in this data collection have special restrictions. Restricted data files are not available for direct download from the website; click on the Restricted Data button to learn more.