parent child relationship,
school age children,
- 1987--1991 (6 Month Interval Interviews [Spring and Fall])
- 1991--2001 (Once yearly interviews)
- 2006--2007 (Data currently not available)
- 2009--2010 (Data currently not available)
Date of Collection:
- 1987 (Spring--Screening [Cohort 1])
- 1987 (Fall--Phase A [Cohort 1])
- 1988 (Spring--Phase B [Cohort 1], Screening [Cohort 2])
- 1988 (Fall--Phase C [Cohort 1], Phase A [Cohort 2])
- 1989 (Spring--Phase D [Cohort 1], Phase B [Cohort 2])
- 1989 (Fall--Phase E [Cohort 1], Phase C [Cohort 2])
- 1990 (Spring--Phase F [Cohort 1], Phase D [Cohort 2])
- 1990 (Fall--Phase G [Cohort 1], Phase E [Cohort 2])
- 1991 (Spring--Phase H [Cohort 1], Phase F [Cohort 2])
- 1991 (Fall--Phase G [Cohort 2])
- 1992 (Spring--Phase J [Cohort 1], Phase H [Cohort 2])
- 1993 (Spring--Phase L [Cohort 1], Phase J [Cohort 2])
- 1994 (Spring--Phase N [Cohort 1], Phase L [Cohort 2])
- 1995 (Spring--Phase P [Cohort 1], Phase N [Cohort 2])
- 1996 (Spring--Phase R [Cohort 1], Phase P [Cohort 2])
- 1997 (Spring--Phase T [Cohort 1], Phase R [Cohort 2])
- 1998 (Spring--Phase V [Cohort 1], Phase T [Cohort 2])
- 1999 (Spring--Phase Y [Cohort 1], Phase V [Cohort 2])
- 2000 (Spring--Phase AA [Cohort 1], Phase Y [Cohort 2])
- 2001 (Spring--Phase AA [Cohort 2])
- 2006--2007 (Fall--Phase CC [Cohort 1&Cohort 2])
- 2009--2010 (Fall--Phase DD [Cohort 1&Cohort 2])
Unit of Observation:
The population of 1st, 4th, and 7th grade students and their parents attending public school in Pittsburgh, Pennsylvania, during the 1987-1988 school year.
This study collection contains only those students, and their parent, who were in 1st grade during the 1987-1988 school year.
Data Collection Notes:
Most variables in the PYS data collection match the questions and labeling information presented in the original survey and interview booklets provided by the research team. However, users should be aware that there are some minor discrepancies between the raw data available as part of this collection and the original survey and interview booklets.
A limited number of questions that appear in the original booklets are not represented by corresponding variables in the data. Variables were dropped from the data collection for various reasons including confidentiality concerns, the presence of duplicate questions in the original questionnaire, and missing unit information that made particular questions/variables meaningless. Additionally, some free-response questions were never coded or entered into the data files.
Variable labels were written with the intention of conveying exactly what is measured by that individual variable, which in some cases requires including some of the text of other questions. For some variables, the answer codes printed in the questionnaire were inadequate for representing the responses given, so additional codes were added in the data file.
When working with the PYS data, please use Cohort 2 booklets unless the question is labeled "Cohort 1 only." Approximately two-thirds of the participants in PYS are part of Cohort 2. Therefore, the PYS standard is to use Cohort 2 variable names and data structure as much as possible. Consequently, Cohort 2 booklets match best with the available datasets.
Open-ended questions have been recoded to categorical variables to facilitate analyses and to protect respondent confidentiality. A detailed inventory of variable-level recodes is not available.
Single questions in which a respondent could circle or select more than one response are represented by multiple dichotomous variables in the data.
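As an illustration of how such a question might be read back from the data, the sketch below collects the options one respondent selected. The variable names (Q12_A, Q12_B, Q12_C) and the 1 = selected / 0 = not selected coding are hypothetical assumptions for this example; actual PYS variable names and value codes should be taken from the codebooks.

```python
def selected_options(row, prefix, selected_code=1):
    """Return the option suffixes selected for one multi-response question.

    row           -- mapping of variable name to value for one respondent
    prefix        -- shared name prefix of the dichotomous variables (hypothetical)
    selected_code -- value meaning "selected" (assumed to be 1 here)
    """
    return sorted(
        name[len(prefix):]
        for name, value in row.items()
        if name.startswith(prefix) and value == selected_code
    )


# Hypothetical respondent record: Q12 allowed several answers, Q13 did not.
respondent = {"Q12_A": 1, "Q12_B": 0, "Q12_C": 1, "Q13": 4}
print(selected_options(respondent, "Q12_"))  # prints ['A', 'C']
```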
In continuous variables, the PYS research team and ICPSR retained all responses even if a value seems implausible. Users should review continuous variables for high and low outliers before including a variable in their analyses. Researchers may consider top- or bottom-coding responses or re-coding outliers to missing data codes.
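One possible way to top- and bottom-code a continuous variable is sketched below. It is not a procedure prescribed by the PYS research team or ICPSR: the percentile cutoffs, the function names, and the missing-data code of -9 are all assumptions made for the example, and the actual missing-data codes should be taken from the codebooks.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile of an already sorted list, p in [0, 100]."""
    if not sorted_values:
        raise ValueError("no valid values to summarize")
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def top_bottom_code(values, low_pct=1, high_pct=99, missing_codes=()):
    """Clamp valid values to the [low_pct, high_pct] percentile range.

    Values listed in missing_codes (and None) are excluded when the cutoffs
    are computed and are passed through unchanged.
    """
    def is_missing(v):
        return v is None or v in missing_codes

    valid = sorted(v for v in values if not is_missing(v))
    low = percentile(valid, low_pct)
    high = percentile(valid, high_pct)
    return [v if is_missing(v) else min(max(v, low), high) for v in values]


# Hypothetical variable: 500 is an implausible value, -9 an assumed missing code.
raw = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 500, -9]
coded = top_bottom_code(raw, low_pct=10, high_pct=90, missing_codes=(-9,))
print(coded)
```

Here the implausible value 500 is pulled down to the 90th-percentile cutoff while the assumed missing code is left untouched. Recoding outliers to a missing-data code instead would only require replacing the clamp with that code.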
In some datasets, there are differences in the numbering and lettering of questions between the data and documentation. Please note that even though differences may exist, the text of the questions is largely identical between the data and documentation. Therefore, these numbering or lettering differences should not significantly impede users' attempts to match questions in the data with the documentation provided.
There may be differences between early versions of the questionnaire booklets and the data in the way answers are coded. Changes were made to coding in the data to make it consistent with questionnaires from subsequent phases.
Please see the readme file for a description of the public use documentation and the 636 datasets in this collection.
The following phases from the youngest sample are included in this collection:
- Screening (1987/1988)
- Phases A - H (1987/1988 - 1991/1992)
- Phase J (1992/1993)
- Phase L (1993/1994)
- Phase N (1994/1995)
- Phase P (1995/1996)
- Phase R (1996/1997)
- Phase T (1997/1998)
- Phase V (1998/1999)
- Phase Y (1999/2000)
- Phase AA (2000/2001)
The following phases from the youngest sample are not included in this collection:
- Phase CC (2006/2007)
- Phase DD (2009/2010)
The Pittsburgh Youth Study (PYS) is a part of the Program of Research on the Causes and Correlates of Delinquency (Causes and Correlates), initiated in 1986 by the United States Office of Juvenile Justice and Delinquency Prevention. Causes and Correlates is designed to improve the understanding of serious delinquency, violence, and drug use by examining how youth develop within the context of family, school, peers, and community. Specifically, PYS aims to document the development of antisocial and delinquent behavior from childhood to early adulthood, the risk factors that impinge on that development, and help seeking and service provision for boys' behavior problems. It also focuses on boys' development of alcohol and drug use and of internalizing problems. Additionally, the study serves as a real-life laboratory for advancing and testing hypothesized developmental pathways.
The initial sample for the Pittsburgh Youth Study (PYS) was selected with the assistance of the Pittsburgh Board of Education in 1987. PYS researchers started out with comprehensive public school lists of the enrollment of 1,631, 1,432, and 1,419 male students in grades 1, 4, and 7 during the 1987-1988 school year respectively. From these lists, researchers randomly selected about 1,100 boys in each of the three grades to be contacted (1,165, 1,146, and 1,125 in grades 1, 4, and 7, respectively). However, a number of the children had moved out of the school district, proved to be girls, or were of incorrect age and were therefore not eligible participants. Eventually, 1,006, 1,004, and 998 families with eligible boys in grades 1, 4, and 7, respectively, were contacted. Boys in grade 1 became the "youngest" sample, boys in grade 4 became the "middle" sample, and boys in grade 7 became the "oldest" sample. From this contact, 84.6 percent, 86.3 percent, and 83.9 percent of the eligible boys in the youngest, middle, and oldest samples respectively chose to participate in PYS.
In order to increase the number of high-risk males in the sample, researchers used a screening assessment on a subset of the boys during the first phase of the study, Phase S. Risk scores from this screening assessment measured each boy's antisocial behavior using parent, teacher, and self-report instruments. Within each grade-based sample, boys scoring in the top 30 percent on the screening risk measure (n=~250), as well as an equal number of boys randomly selected from the remaining 70 percent (n=~250), were selected for follow-up in subsequent phases (Phase A through Phase DD). This resulted in final samples of 503, 508, and 506 boys in grades 1, 4, and 7, respectively, who together with their parent were to be followed up.
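The selection rule described above can be sketched in a few lines. This is only an illustration of the general technique (take the highest-scoring 30 percent, then draw an equal-sized simple random sample from the rest), not the research team's actual procedure: the function name, the handling of tied scores, the random seed, and the synthetic scores are all assumptions.

```python
import random


def select_follow_up(risk_scores, top_fraction=0.30, seed=0):
    """Split screened participants into a high-risk group and a comparison group.

    risk_scores -- mapping of participant id to screening risk score
    Returns (high_risk_ids, comparison_ids). Ties at the cutoff are broken by
    sort order here, which is an assumption of this sketch.
    """
    ranked = sorted(risk_scores, key=risk_scores.get, reverse=True)
    n_top = round(len(ranked) * top_fraction)
    high_risk = ranked[:n_top]
    remainder = ranked[n_top:]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    comparison = rng.sample(remainder, min(n_top, len(remainder)))
    return high_risk, comparison


# Synthetic screening data: 850 hypothetical participants with made-up scores.
score_rng = random.Random(1)
scores = {pid: score_rng.random() for pid in range(850)}
high_risk, comparison = select_follow_up(scores)
print(len(high_risk), len(comparison))  # prints 255 255
```

Because high-risk boys are deliberately overrepresented in a sample built this way, analyses meant to describe the original screened population would generally need to account for the unequal selection probabilities.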
The youngest sample (N=503) and the oldest sample (N=506) have been assessed continuously since 1987, while the middle sample (N=508) was only assessed seven times from ages 10-13. Assessments of each of the cohorts were carried out initially half-yearly, and later yearly. When the assessment periods switched from six months to one year, the youngest sample was interviewed every spring and the oldest sample every fall. Each phase letter still represents a six-month period. Thus, all the phases from H through AA have data for only one sample.
Mode of Data Collection:
paper and pencil interview (PAPI)
In-person and telephone structured interviews of student participants and one parental figure
Self-Report and mail questionnaires completed by student participants and one parental figure
Description of Variables:
There are 636 datasets present in the Pittsburgh Youth Study (PYS). Each dataset includes variables relating to a particular topic and collected from the student, their caretaker and their teacher. The interviewers themselves also provided appraisals of the student's neighborhood conditions. The topics covered were as follows:
- Demographic Information
- Extent and Consequences of Participant Drug Use and Delinquency
- Peer Drug Use
- Peer Delinquency
- Attitudes Towards Delinquency
- Family Delinquency
- Level of Supervision
- Health History
- Participant Gang Involvement
- Crime Victimization
- Neighborhood Characteristics
- Romantic Relationships and Sexual Experiences
- Caretaker Stress Level
- Employment Status and Skills
- Parenting Style and Involvement
- Attitudes Towards Education
- Parent-Child Relationship and Communication
- Personal Characteristics
- Parental Expectations
- Methods of Discipline
- Household Financial Information
- Insurance Information
Participant retention for the Pittsburgh Youth Study has historically been high (mean=91 percent), with 82 percent of living participants completing the most recent interview conducted in 2010.
Extent of Processing:
ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:
- Created variable labels and/or value labels.
- Standardized missing values.
- Checked for undocumented or out-of-range codes.