Longitudinal Study of American Youth, 1987-1994, and 2007 (ICPSR 30263)

Principal Investigator(s):   Jon D. Miller

The Longitudinal Study of American Youth (LSAY) was designed to examine the development of (1) student attitudes toward and achievement in science, (2) student attitudes toward and achievement in mathematics, and (3) student interest in and plans for a career in science, mathematics, or engineering during middle school, high school, and the first four years after high school. It was also designed to estimate the relative influence of parents, home, teachers, school, peers, media, and selected informal learning experiences on these developmental patterns. At the time of the original award, it was not known whether support would be available beyond the initial four-year period, so the first years were designed to create a synthetic cohort extending from grade seven through the first year after high school. To allow the construction of this synthetic cohort, a two-cohort design was developed.

The older LSAY cohort, Cohort One, consisted of a national sample of 2,829 tenth-grade students in public high schools throughout the United States. These students were followed for an initial period of seven years, ending four years after high school in 1994. The younger cohort, Cohort Two, consisted of a national sample of 3,116 seventh-grade students in public schools that served as feeder schools to the same high schools in which the older cohort was enrolled. These students were also followed for an initial period of seven years, concluding with a telephone interview approximately one year after the end of high school in 1994.

Beginning in the fall of 1987, the LSAY collected a wide array of information from each student, including: (1) a science achievement test and a mathematics achievement test each fall, (2) an attitudinal and experience questionnaire at the beginning and end of each school year, (3) reports about education and experience from all science and math teachers in each school, (4) reports on classroom practice by each science and math teacher serving an LSAY student, (5) an annual 25-minute telephone interview with one parent of each student, and (6) extensive school-level information from the principal of each study school.

In 2006, the NSF funded a proposal to re-contact the original LSAY students (now in their mid-30s) to resume data collection to determine their educational and occupational outcomes. Through an extensive tracking activity (described in Kimmel and Miller, 2008), more than 95 percent of the original sample of 5,945 LSAY students were located or accounted for. A new eligible sample of approximately 5,000 students was defined and these young adults were asked to complete a survey in 2007.

The public release data files include the information collected from the national probability sample students, their parents, and the science and mathematics teachers in their schools during the initial seven years, beginning in the fall of 1987, as well as the data collected in the 2007 questionnaire.

The original two-cohort, two-file data structure reflected the initial period of data collection, but it was awkward for users who wanted to compare the two cohorts or to combine them for various analyses. The merged data file includes a variable that indicates the original cohort, allowing a user to repeat or extend any analysis conducted with the previous LSAY release file. The variables in the merged file have been renamed to correct dual or conflicting variable names and indicators. Equally important, the new merged file structure will facilitate the annual release of new cycles of data collection through the addition of variables to the base system.

Analysts are encouraged to read the LSAY user guide before doing any data analysis.

Access Notes

  • These data are available only to users at ICPSR member institutions.

  • This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.


WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.

DS0:  Study-Level Files
DS1:  LSAY Merged Cohort File (Base File) (755,684 KB)
DS2:  Teacher Background Questionnaires Years 1-5 (60,699 KB)
DS3:  Teacher Background Questionnaires Year 6 (57,235 KB)
DS4:  Principal Questionnaire Fall 89 (55,537 KB)
DS5:  Principal Questionnaire Fall 93 (55,258 KB)

Study Description


Miller, Jon D. Longitudinal Study of American Youth, 1987-1994, and 2007. ICPSR30263-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-04-01. doi:10.3886/ICPSR30263.v1

This study was funded by:

  • National Science Foundation (MDR-8550085, REC96-27669, RED-9909569, REC-0337487, DUE-0525357, DUE-0856695, DRL-0917535)

Scope of Study

Subject Terms:   academic achievement, achievement tests, aptitude, high school students, junior high school students, mathematics, parents, postsecondary education, public schools, school age children, school principals, schools, science, science education, secondary education, student attitudes, teacher attitudes, teacher morale, teachers, teaching (occupation)

Geographic Coverage:   United States

Time Period:  

  • 1987--1994
  • 2007

Date of Collection:  

  • 1987--1994
  • 2007

Unit of Observation:   individual

Universe:   The methodology and sampling are described in the LSAY user guide.

Data Types:   survey data

Data Collection Notes:

ICPSR created a unique sequential record identifier variable named CASEID for use with online analysis.

This is a new data deposit. Updated versions will be added each year, as data collection continues.

Several variables in the LSAY base file required formats to be widened so that more unique values were present in the data.

Variables AB14V, AB19N, AB19O, AB19Q, and AB19T had value -99 recoded to -95 per PI request.

Due to the size of the file, a codebook for Part 1 was not created. The issue is being checked and if a viable solution is possible, an update will be released.

For further information about LSAY see the Longitudinal Study of American Youth Web site (http://www.lsay.org).


Sample:   The sampling scheme for the base year of the LSAY was a two-stage stratified probability sample. The United States was stratified by four geographic regions and by three levels of urban development (central city, suburban, and nonmetropolitan) to produce a total of 12 strata. Stage I involved the selection of schools to participate in the study. Stage II was the random selection of 60 students within each school selected in Stage I.
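The stratification and the Stage II draw described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four region labels are hypothetical (the record does not name the regions), the function name sample_students is invented for this example, and Stage I school selection is not modeled because the record does not say how schools were chosen within strata.

```python
import itertools
import random

# Assumed region labels; the study description only says "four geographic regions".
REGIONS = ["Northeast", "Midwest", "South", "West"]
# Urban development levels as named in the sample description.
URBAN_LEVELS = ["central city", "suburban", "nonmetropolitan"]

# Crossing 4 regions with 3 urban levels yields the 12 strata.
strata = list(itertools.product(REGIONS, URBAN_LEVELS))


def sample_students(roster, n=60, seed=None):
    """Stage II sketch: simple random sample of up to n students from one school."""
    rng = random.Random(seed)
    return rng.sample(roster, min(n, len(roster)))


# Hypothetical school roster of 300 student IDs.
roster = list(range(300))
selected = sample_students(roster, seed=1)
print(len(strata), len(selected))
```

The user guide remains the authoritative description of how schools and students were actually selected.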

Weight:   Weight variables have been calculated to adjust for unequal attrition from the original sample over the period of the longitudinal study. For example, if 10 percent of students from School A drop out of or are lost to the study and 20 percent of students from School B drop out of or are lost to the study, unweighted use of the data set would produce estimates that overestimate the contribution of students from School A and underestimate the contribution of students from School B. Correct estimates of national distributions can be obtained only by using the appropriate weight variable for the analysis at hand. A new longitudinal weight, WGT12A, was created for the merged file containing both cohorts and should be used for all longitudinal analyses that include both cohorts for the high school years. Additional weights are available; for detailed information, please see the user guide.
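The School A and School B example can be worked through numerically. The sketch below assumes each school started with 60 sampled students and uses a simple inverse-retention weight (original count divided by retained count). That rule and the function name attrition_weights are assumptions made for illustration; the record does not describe how WGT12A or the other LSAY weights were actually computed.

```python
from fractions import Fraction


def attrition_weights(original, retained):
    """Hypothetical inverse-retention weight per school: original / retained."""
    return {school: Fraction(original[school], retained[school]) for school in original}


# Assumed starting sizes: 60 students per school.
original = {"A": 60, "B": 60}
# School A loses 10 percent (54 remain); School B loses 20 percent (48 remain).
retained = {"A": 54, "B": 48}

weights = attrition_weights(original, retained)

# Unweighted, School A makes up more than half of the remaining cases.
unweighted_share_a = Fraction(retained["A"], sum(retained.values()))

# Weighted, each school again represents its original 60 students.
weighted_totals = {s: retained[s] * weights[s] for s in retained}
weighted_share_a = weighted_totals["A"] / sum(weighted_totals.values())

print(float(unweighted_share_a), float(weighted_share_a))
```

Under these assumptions School A's share falls from 54/102 unweighted back to one half once the weights are applied, which is the correction the paragraph describes.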

Mode of Data Collection:   self-enumerated questionnaire

Response Rates:   Please see the user guide for detailed response rate information.

Extent of Processing:  ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Created variable labels and/or value labels.
  • Created online analysis version with question text.
  • Checked for undocumented or out-of-range codes.


Original ICPSR Release:  

Version History:

  • 2011-04-01 PI information corrected.
