This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.

Longitudinal Study of American Youth, 1987-1994, 2007-2009 (ICPSR 30263)

Principal Investigator(s):   Miller, Jon D.

The Longitudinal Study of American Youth (LSAY) is a project that was funded by the National Science Foundation in 1985 and was designed to examine the development of: (1) student attitudes toward and achievement in science, (2) student attitudes toward and achievement in mathematics, and (3) student interest in and plans for a career in science, mathematics, or engineering, during middle school, high school, and the first four years post-high school. The relative influence parents, home, teachers, school, peers, media, and selected informal learning experiences had on these developmental patterns was considered as well.

The older LSAY cohort, Cohort One, consisted of a national sample of 2,829 tenth-grade students in public high schools throughout the United States. These students were followed for an initial period of seven years, ending four years after high school in 1994. Cohort Two consisted of a national sample of 3,116 seventh-grade students in public schools that served as feeder schools to the same high schools in which the older cohort was enrolled. These students were followed for an initial period of seven years, concluding with a telephone interview approximately one year after the end of high school in 1994.

Beginning in the fall of 1987, the LSAY collected a wide array of information including: (1) a science achievement test and a mathematics achievement test each fall, (2) an attitudinal and experience questionnaire at the beginning and end of each school year, (3) reports about education and experience from all science and math teachers in each school, (4) reports on classroom practice by each science and math teacher serving a LSAY student, (5) an annual 25-minute telephone interview with one parent of each student, and (6) extensive school-level information from the principal of each study school.

In 2006, the NSF funded a proposal to re-contact the original LSAY students (by then in their mid-30s) and resume data collection to determine their educational and occupational outcomes. Through an extensive tracking activity that involved (1) online tracking, (2) newsletter mailing, (3) calls to parents and other relatives, (4) use of alternative online search methods, and (5) questionnaire mailing, more than 95 percent of the original sample of 5,945 LSAY students were located or accounted for. In addition to re-contacting the students, the proposal defined a new eligible sample of approximately 5,000 students. These young adults were asked to complete a survey in 2007, 2008, and 2009.

The public release data files include information collected from the national probability sample of students, their parents, and the science and mathematics teachers in the students' schools. The data cover the initial seven years, beginning in the fall of 1987, as well as the 2007, 2008, and 2009 questionnaires.

Part 1: LSAY Merged Cohort (Base File) contains student and parent data from both cohorts of the LSAY from 1987-1994 and student follow-up data from 2007-2009. Parts 2-5 contain information gathered from two teacher background questionnaires and two principal questionnaires from 1987-1994.

Access Notes

  • These data are available only to users at ICPSR member institutions.


WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.

DS0:  Study-Level Files
DS1:  LSAY Merged Cohort File (Base File) (863,452 KB)
DS2:  Teacher Background Questionnaires Years 1-5 (63,639 KB)
DS3:  Teacher Background Questionnaires Year 6 (59,769 KB)
DS4:  Principal Questionnaire Fall 89 (58,001 KB)
DS5:  Principal Questionnaire Fall 93 (57,711 KB)

Study Description


Miller, Jon D. Longitudinal Study of American Youth, 1987-1994, 2007-2009. ICPSR30263-v2. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2014-04-24. doi:10.3886/ICPSR30263.v2

This study was funded by:

  • National Science Foundation (REC-0337487, MDR-8550085, REC96-27669, DUE-0856695, DRL-0917535, RED-9909569, DUE-0525357)

Scope of Study

Subject Terms:   academic achievement, achievement tests, aptitude, engineering industry, high school students, junior high school students, mathematics, parents, postsecondary education, public schools, school age children, school principals, schools, science, science education, secondary education, student attitudes, teacher attitudes, teacher morale, teachers, teaching (occupation)

Smallest Geographic Unit:   region

Geographic Coverage:   United States

Time Period:  

  • 1987--1994
  • 2007--2009

Date of Collection:  

  • 1987--1994
  • 2007--2009

Unit of Observation:   individual

Universe:   7th and 10th grade students in public schools in the United States in 1987, as well as those same students who could be recontacted in 2007, 2008, and 2009 with a follow-up questionnaire.

Data Types:   survey data

Data Collection Notes:

ICPSR created a unique sequential record identifier variable named CASEID for use with online analysis.

The original two-cohort, two-file data structure reflected the initial period of data collection, but it was awkward for users who wanted to compare the two cohorts or to combine them for various analyses. The merged data file includes a variable that indicates the original cohort, allowing a user to repeat or extend any analysis conducted with the previous LSAY release file. The variables in the merged file have been renamed to correct dual or conflicting variable names and indicators. The new merged file structure will facilitate the annual release of new cycles of data collection through the addition of variables to the base system.

For further information about LSAY, see the Longitudinal Study of American Youth website.


Study Design:   The LSAY sample design consisted of a sample of high schools and a sample of middle or junior high schools that sent students to the participating high schools. Selection of the latter set of schools was accomplished by obtaining information from high school officials on feeder patterns to their schools. Many of the sampled high schools were served by only one feeder school, and in nine selections the middle school grade levels were part of the participating high school itself. A number of the high schools, however, received students from two or more feeder schools, and in these cases one feeder school had to be selected. The selection procedure involved calculating the proportion of students in the high school who came from each feeder school and then randomly selecting one feeder school, where the probability of selection was proportional to each feeder's contribution to the high school's enrollment. In the event that a school or district declined to participate in the LSAY, a school of similar size, with a zip code indicating proximity to the original selection, was chosen. Once a school's cooperation was secured, the LSAY obtained a complete student roster for the seventh and tenth grade cohorts. To provide a sufficient number of students in each school to compute school effects in subsequent analyses, a sample of 60 students was selected from each school. Students were selected randomly from the lists and asked to participate until the target response size was achieved. In some schools with fewer than 60 students in their seventh or tenth grade classes, all students were selected for participation. When a student refused to participate, the school research coordinator was directed to draw a replacement from an additional list of students, starting at the beginning of the alternate list and proceeding sequentially until a participant was secured. The alternate list was selected randomly, using the same procedures outlined above for constructing the original sample. The LSAY fielded over 40 instruments for Cohort Two and 26 for Cohort One from October 1987 through June 1994. Resumption of LSAY tracking activities began in April 2006, and re-entry questionnaires were administered in 2007, 2008, and 2009. For more information on Study Design, please refer to the Original P.I. Documentation in the ICPSR User Guide.
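
The feeder-school selection described in the Study Design can be illustrated with a short Python sketch. This is only an illustration of selection with probability proportional to enrollment contribution, not the LSAY's actual procedure or code. The school names, student counts, and the `select_feeder_school` helper are hypothetical.

```python
import random


def select_feeder_school(feeder_counts, rng):
    """Pick one feeder school with probability proportional to the number
    of high school students it contributed (hypothetical helper)."""
    schools = list(feeder_counts)
    weights = [feeder_counts[school] for school in schools]
    # random.choices draws with replacement using the relative weights given.
    return rng.choices(schools, weights=weights, k=1)[0]


# Hypothetical feeder pattern for one high school: number of enrolled
# students who came from each feeder school (made-up figures).
feeder_counts = {"Feeder A": 300, "Feeder B": 150, "Feeder C": 50}

# Selection probability of each feeder = its share of the enrollment.
total = sum(feeder_counts.values())
selection_probabilities = {s: n / total for s, n in feeder_counts.items()}

rng = random.Random(1987)  # fixed seed so the draw is reproducible
chosen = select_feeder_school(feeder_counts, rng)

print(selection_probabilities)
print("Selected feeder school:", chosen)
```

Under these made-up counts, the largest feeder is six times as likely to be selected as the smallest, which mirrors the proportional rule stated in the Study Design.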

Sample:   The sampling scheme for the base year of the LSAY was a two-stage stratified probability sample. The United States was stratified by four geographic regions and by three levels of urban development (central city, suburban, and nonmetropolitan) to produce a total of 12 strata. Stage I involved the selection of schools to participate in the study. Stage II was the random selection of 60 students within each school selected in Stage I. Resumption of LSAY tracking activities began in April 2006, and re-entry questionnaires were administered in 2007, 2008, and 2009. For more information on Sampling, please refer to the Original P.I. Documentation in the ICPSR User Guide.
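
As a rough sketch of the two-stage structure, the following Python snippet enumerates the 12 strata and draws a Stage II student sample from a roster. The region labels are an assumption (this record does not name the four regions), and the rosters and the `sample_students` helper are hypothetical.

```python
import random
from itertools import product

# Assumed region labels; the study record only says "four geographic regions".
REGIONS = ["Northeast", "Midwest", "South", "West"]
# Urban development levels as named in the Sample description.
URBAN_LEVELS = ["central city", "suburban", "nonmetropolitan"]

# Crossing 4 regions with 3 urban levels yields the 12 strata.
strata = list(product(REGIONS, URBAN_LEVELS))


def sample_students(roster, rng, target=60):
    """Stage II sketch: randomly select `target` students from a school
    roster, or take every student when the roster is not larger than that."""
    if len(roster) <= target:
        return list(roster)
    return rng.sample(roster, target)


rng = random.Random(1987)  # fixed seed for a reproducible illustration
large_roster = [f"student_{i}" for i in range(200)]  # hypothetical roster
small_roster = [f"student_{i}" for i in range(45)]   # hypothetical roster

large_sample = sample_students(large_roster, rng)
small_sample = sample_students(small_roster, rng)

print(len(strata), len(large_sample), len(small_sample))
```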

Weight:   The data are not weighted; however, Part 1: LSAY Merged Cohort (Base File) contains many weight variables. These weights were calculated to adjust for the unequal erosion from the original sample over the period of the longitudinal study. For example, if 10 percent of students from School A drop out of or are lost to the study and 20 percent of students from School B drop out of or are lost to the study, unweighted use of the dataset would produce estimates that overestimate the contribution of students from School A and underestimate the contribution of students from School B. Correct estimates of national distributions can only be obtained by using the appropriate weight variable for the analysis at hand. A new longitudinal weight, WGT12A, was created for the merged file containing both cohorts and should be used for all longitudinal analyses containing both cohorts for the high school years. In addition, please refer to the Original P.I. Documentation in the ICPSR User Guide for a description of all weights that are present in the data collection.
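
The School A and School B example can be worked through numerically. The Python sketch below assumes both schools started with 60 sampled students and uses a simple inverse-retention adjustment; this is a simplification for illustration and is not how WGT12A or the other LSAY weights were constructed.

```python
# Assumed starting sample: 60 students per school (the Stage II target).
original = {"School A": 60, "School B": 60}
# 10 percent lost from School A, 20 percent lost from School B.
retained = {"School A": 54, "School B": 48}

# Unweighted share of the remaining cases contributed by each school.
total_retained = sum(retained.values())
unweighted_share = {s: retained[s] / total_retained for s in retained}

# Simplified attrition weight: original sample size / retained sample size.
weights = {s: original[s] / retained[s] for s in original}

# Weighted share: each retained student counts for `weights[s]` students.
weighted_totals = {s: retained[s] * weights[s] for s in retained}
total_weighted = sum(weighted_totals.values())
weighted_share = {s: weighted_totals[s] / total_weighted for s in retained}

print("Unweighted shares:", unweighted_share)
print("Weighted shares:  ", weighted_share)
```

Without weights, School A accounts for roughly 53 percent of the remaining cases even though both schools began with equal samples; applying the weights restores the equal shares.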

Mode of Data Collection:   cognitive assessment test, mail questionnaire, mixed mode, on-site questionnaire, telephone interview, web-based survey

Response Rates:   For more information on Response Rates, please refer to the Original P.I. Documentation in the ICPSR User Guide.

Presence of Common Scales:   For more information on Scales, please refer to the Original P.I. Documentation in the ICPSR User Guide.

Extent of Processing:  ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Standardized missing values.
  • Checked for undocumented or out-of-range codes.


Original ICPSR Release:  

Version History:

  • 2014-04-24 This is an update to LSAY data (ICPSR 30263). Part 1: LSAY Merged Cohort File (Base File) includes the following: (1) data collected in the 2008 and 2009 questionnaires (these data were not available in the previous release), (2) all data collected in the 2007 questionnaire, which includes additional cases not available in the earlier release as well as corrections and clarifications to some cases, and (3) all constructed student and parent variables from 1987-1994. The 2007 data and constructed variables that were previously included in ICPSR 30263 were replaced with Part 1: LSAY Merged Cohort (Base File). Parts 2-5 include data collected from 1987-1994. These data files are identical to the previously released files. In addition, R data files have been added for Parts 2-5.
  • 2011-04-01 PI information corrected.
