Head Start Impact Study (HSIS), 2002-2006 Resource Guide
About the Guide
This resource guide provides a brief overview of the Head Start Impact Study (HSIS), 2002-2006 and specific instructions for obtaining the restricted-use HSIS datasets. HSIS users should refer to the User Guide (found on the documentation page), which provides greater detail on the topics discussed below.
About the Data
Since its beginning in 1965 as a part of the War on Poverty, Head Start's goal has been to boost the school readiness of low income children. Based on a "whole child" model, the program provides comprehensive services that include the following: preschool education, medical care, dental care, mental health care, nutrition services, and efforts to help parents foster their child's development. Head Start services are designed to be responsive to each child's and family's ethnic, cultural, and linguistic heritage.
In the 1998 reauthorization of Head Start, Congress mandated that the United States Department of Health and Human Services determine, on a national level, the impact of Head Start on the children it serves. This legislative mandate required that the impact study address two main research questions:
- What difference does Head Start make to key outcomes of development and learning (and in particular, the multiple domains of school readiness) for low-income children? What difference does Head Start make to parental practices that contribute to children's school readiness?
- Under what circumstances does Head Start achieve the greatest impact? What works for which children? What Head Start services are most related to impact?
The Head Start Impact Study (HSIS) addresses these questions by reporting on the impacts of Head Start on children and families during the children's preschool, kindergarten, and first grade years. It was conducted with a nationally representative sample of nearly 5,000 three- and four-year-old preschool children across 84 nationally representative grantee/delegate agencies in communities where there are more eligible children and families than can be served by the program. The children participating were randomly assigned to either a treatment group (which had access to Head Start services) or a comparison group (which did not have access to Head Start services, but could receive other community resources). Data collection began in the fall of 2002 and ended in spring 2006, following children through the spring of their first grade year. Baseline data were collected through parent interviews and child assessments in fall 2002. The annual spring data collection included child assessments, parent interviews, teacher surveys, and teacher-child ratings. In addition, during the preschool years only, data collection included classroom and family day care observations, center director interviews, care provider interviews, and care provider-child ratings.
The study examined differences in outcomes in several domains related to school readiness: children's cognitive development, social-emotional development, and physical health, as well as parenting outcomes (e.g., reading to the child, use of spanking and time out, exposing children to cultural enrichment activities, safety practices, and parent-child relationships). It also examined whether impacts differed based on characteristics of the children and their families, including the child's pre-academic skills at the beginning of the study, the child's primary language, whether the child has special needs, the mother's race/ethnicity, the primary caregiver's level of depressive symptoms, household risk, and urban or rural location.
It is important to note that the Head Start Impact Study (HSIS) was designed to determine whether Head Start has impacts on participating children and their parents and whether any impacts vary among different types of children and families. By 'impact' we mean a difference between the outcomes observed for Head Start participants (Head Start or treatment group) and what would have been observed for these same individuals had they not participated in Head Start (measured using a control group). The HSIS data files are best for analyzing research questions related to comparing outcomes for children relative to their participation in Head Start while other data (e.g. Head Start Family and Child Experiences Survey (FACES) Data) may better fit research questions related to variation in children within Head Start. The HSIS and FACES instruments include similar measurements, but FACES can provide more recent information.
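The definition of impact above can be illustrated with a small sketch: a weighted mean outcome for the Head Start group minus the weighted mean for the control group. The record layout, field names, and numbers are made up for illustration and are not HSIS variables or results; the analyses in the HSIS Final Report also adjust for covariates, which this sketch omits.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def impact_estimate(records):
    """Weighted mean outcome of the Head Start group minus that of the control group."""
    groups = {"head_start": ([], []), "control": ([], [])}
    for rec in records:
        values, weights = groups[rec["group"]]
        values.append(rec["outcome"])
        weights.append(rec["weight"])
    return weighted_mean(*groups["head_start"]) - weighted_mean(*groups["control"])


# Made-up records (hypothetical field names), not HSIS data.
records = [
    {"group": "head_start", "outcome": 12.0, "weight": 2.0},
    {"group": "head_start", "outcome": 10.0, "weight": 1.0},
    {"group": "control", "outcome": 9.0, "weight": 1.0},
    {"group": "control", "outcome": 11.0, "weight": 3.0},
]
impact = impact_estimate(records)
```

Because children are compared by the group they were assigned to, a difference computed this way reflects the effect of being given access to Head Start rather than the effect of actually attending.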
This resource guide was prepared by Sara C. Lazaroff, ICPSR and Julia Roach, ICPSR. It was developed for the PreK-3rd Data Resource Center: The First Six Years of Schooling and Beyond, a website hosted by ICPSR with support from the Foundation for Child Development.
First-time applicants to Head Start in fall 2002 were randomly selected from a nationally representative sample of Head Start programs. The study used a multi-stage sampling process that included the following:
- Identify grantee/delegate agencies - the process began by using the Head Start Program Information Report (PIR) to create a list of Head Start grantee and delegate agencies operating in all 50 States, DC and Puerto Rico in fiscal year 1998-99.
- Create, stratify, and select geographic clusters - the programs were grouped by geographic proximity into clusters (n=161) and each cluster was then grouped into strata (n=25) to ensure variation in factors such as state pre-K and child care policy, child race/ethnicity, urban/rural location, and region.
- Determine grantee/delegate agency eligibility - To be eligible for inclusion in the study sample, grantee/delegate agencies had to have "extra" newly entering applicants beyond their number of funded slots to allow for the creation of a non-Head Start control group. That is, the programs could not already be serving all the eligible children in their community who wanted Head Start (serving all such children is a situation we refer to as "saturation"). Ethically, random assignment could only be conducted in communities where Head Start programs were expected to be unable to serve all the eligible children seeking enrollment for fall 2002.
- Stratify and select grantee/delegate agencies - Under a PPS (Probability Proportional to Size) sample design, the largest programs have the highest probability of being selected. To ensure the inclusion of the full range of Head Start grantee/delegate agencies, smaller programs were combined with other agencies in the same cluster to form "grantee/delegate agency groups." The single grantee/delegate agencies and the formed groups were then stratified along several dimensions to ensure that the selected programs represented the following conditions: urban location, auspice (school based versus all other agency types), Hispanic and African American enrollment, program options (part-day, full-day, both), and the percentage of total enrollment represented by newly entering 3-year-olds.
- Recruit grantee/delegate agencies - this resulted in 76 grantee/delegate agency groups and 87 individual grantee/delegate agencies.
- Develop list of Head Start centers - Participating grantee/delegate agencies provided lists of operating centers as of fall 2002 (n=1,427 centers).
- Determine eligible centers and create center groups - The center-level data were first used to eliminate 169 centers determined to be "saturated," as was done previously for grantee/delegate agencies. This step reduced the total eligible pool of centers from 1,427 to 1,258 across 84 separate grantee/delegate agencies in 76 grantee/delegate agency groups.
- Stratify and select a sample of study centers - Centers were stratified using the same characteristics used with grantee/delegate agencies. Centers were then randomly selected, and saturated centers were excluded (84 grantee/delegate agencies, 383 centers).
- Select children and conduct random assignment - The sample of Head Start grantee/delegate agencies and centers, when properly weighted, was designed to yield a sample of children that represented the national population of newly entering children and their families (with the exclusions noted above) for the 2002-03 program year. The sample of children included 2,783 Head Start children and 1,884 control children.
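To make the PPS idea in the agency selection step concrete, here is a minimal sketch of one common way to draw a sample with probability proportional to size: systematic selection along the cumulative size totals. The source does not say which PPS algorithm the study used, so the systematic method, the function name, and the unit sizes are all assumptions for illustration.

```python
import random


def pps_systematic_sample(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes maps unit name -> measure of size (e.g., enrollment).
    A unit larger than the sampling interval can be selected more than once.
    """
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    target_index = 0
    for unit, size in sizes.items():
        cumulative += size
        # Every target that falls inside this unit's size range selects it.
        while target_index < n and targets[target_index] < cumulative:
            selected.append(unit)
            target_index += 1
    return selected


# Made-up sizes: one large agency and three smaller ones (hypothetical names).
sizes = {"agency_A": 500, "agency_B": 300, "agency_C": 100, "agency_D": 100}
sample = pps_systematic_sample(sizes, n=2, rng=random.Random(2002))
```

In this toy example the interval is 500, so the largest unit is always selected while each small unit is selected only occasionally. That imbalance is the reason the study combined smaller programs into groups before selection.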
In total, 4,667 newly entering children were randomly assigned and included in the Head Start Impact Study. Because Puerto Rico is excluded from these restricted use files for confidentiality reasons, the total number of children included is 4,442. Table 1 provides the distribution of these children by age cohort and treatment/control status.
|Age Cohort||Head Start Group||Control Group||Total Sample|
Baseline data were collected through parent interviews and child assessments in fall 2002. Data collection included annual spring child assessments, parent interviews, teacher surveys, and teacher-child ratings. In addition, during the preschool years only, data collection included classroom/family day care observations, center director interviews, care provider interviews, and care provider-child ratings. Outcome measures were developed in four domains: child cognitive development, child social-emotional development, health, and parenting practices.
The data collection procedures and instruments are described below. Variables from all the instruments are included in the restricted use files. Derived variables referenced in the HSIS Final Report are defined in the codebooks and also are included in the restricted use file. Additional details on measures, derived variables, and the data collection year can be found in the HSIS Final Report and the HSIS Technical Report on the Office of Planning, Research and Evaluation website.
Direct Child Assessments
Child assessments are the most direct measures of the cognitive development of study children and the extent to which they are educationally ready for success in school. The child assessment battery used in the Head Start Impact Study focused on language and literacy, including children's vocabulary knowledge, reading and writing skills and achievement, oral comprehension and phonological awareness, and math skills and achievement. The cognitive test battery consists of both standardized tests developed by recognized test publishing companies and non-standardized tests developed for use in the Head Start Family and Child Experiences (FACES) project. As the children developed, new tests were added to the child assessment battery, existing tests were extended to include more difficult items, and, in some cases, preschool-level tests were dropped as the children entered elementary school.
In-person interviews were typically conducted in the home of each study child with a parent or primary caregiver living with and responsible for raising the child at the fall 2002 baseline point and at each of the subsequent spring 2003, 2004, 2005, and 2006 follow-up data collection waves. Parent interviews were available in both English and Spanish. Information collected during the parent interviews included: (1) parents' report of a variety of child-specific information, including the child's demographic characteristics, health, social-emotional ratings and behavior, developmental accomplishments, and disabilities; (2) parental characteristics such as education, employment, and reported depressive symptoms; (3) household characteristics, such as household risk, household members and income; (4) parent-child activities and interactions such as reading to the child; (5) parenting practices such as safety practices and parenting styles; (6) the child's experiences during preschool and early elementary school years, including parent communication and involvement with school; and (7) community characteristics such as crime in the neighborhood. The parent interview included versions (sometimes modified) of the Child Behavior Checklist (Achenbach, 1987), Developing Skills Checklist (1990), Child-Parent Relationship Scale (Pianta, 1992), Pearlin Mastery Scale-Locus of Control (Pearlin and Schooler, 1978), 20-item CES-D (Seligman, 1993), and Parenting Styles (Baumrind, 1971). Data from these instruments were collected in most, but not all, data collection waves.
Teacher Surveys and Child Ratings
Additional information was obtained from teachers and other care providers (e.g., family child care providers) who completed self-administered questionnaires to rate each of the study children who were in their classroom or care (Teacher/Care Providers' Child Reports). Teachers also completed questionnaires, and care providers were interviewed in person, to obtain information about them, the nature of the setting in which they worked, and the types of services they provided to the selected study children.
Head Start and Elementary School Experiences
Information was obtained on the experiences of children and the services they received during their preschool years (when they were in Head Start or other child care environments), as well as during their kindergarten and 1st grade years. For the preschool year, in-person interviews were conducted with directors of the Head Start and non-Head Start centers that study children attended. To further measure quality of care, direct observations of classrooms and family day care homes were conducted. The teacher survey (described above) also provided information on teacher qualifications and the classroom environments that children attended. Direct observations of care setting and quality were used for children in center-based and family day care home programs, including those participating in Head Start. Only classrooms or family day care homes with study children enrolled were observed. Care setting observations were conducted during the years before the children entered kindergarten. These observational tools provide direct measures of the extent to which study children in Head Start classrooms and other child care programs have skilled teachers who provide developmentally appropriate environments and curricula for these children. Classroom quality was measured using the Early Childhood Environment Rating Scale-Revised (ECERS-R) (Harms et al., 1998) for children who were in centers and the Family Day Care Rating Scale (FDCRS) (Harms and Clifford, 1989) for children who were in family day care homes. In addition, the classroom/family day care home observers completed the Arnett Caregiver Interaction Scale (Arnett, 1989) to measure teacher/caregiver traits and interactions such as teacher sensitivity and encouragement of children's independence.
Covariates and Subgroups
To add the explanatory power of child and family background factors to the analysis reported in the HSIS Final Report, key demographic variables measured in fall 2002 were included as covariates. All of the reported HSIS analyses included the same set of demographic variables as covariates, irrespective of the age cohort, outcome, and follow-up year. The same set of covariates was also used for every subgroup analysis. The selected variables met two criteria: (1) they likely correlate with child and family outcomes (and thereby help to increase the explanatory power of the model), and (2) they could not have been influenced by Head Start during the first weeks of participation (i.e., prior to the time they were measured).
The HSIS restricted use data files are structured as 28 data files (26 instrument files, one covariates and subgroup variables file, and one child experiences file), plus one weights file and one JKN (jackknife) factors file. All the files are child level and include both program and center identifiers. There is one cross-sectional data file per instrument for each data collection period from fall 2002 through spring 2006, with two supplemental data files that contain derived variables from multiple instruments (the covariates and subgroup variables file and the child experiences file). All files include both cohorts of children in a single data file. The variable CHILDCOHORT has been included on every file to allow separate analyses by age cohort.
Each data file includes the following core variables (with the exception of the weights file, which includes only the Child ID and Child cohort):
- HSIS_RAPROGID: Random assignment program ID
- HSIS_RACNTRID: Random assignment center ID
- HSIS_CHILDID: Child ID
- CHILDCOHORT: Child cohort status which designates whether the child is in the 3-year-old or 4-year-old cohort
- CHILDRESULTGROUP: Treatment/control status which designates whether the child is in the Head Start or control group
- CROSSOVER: Crossover status which designates a child who was assigned as a control but participated in Head Start
- NOSHOW: No Show status which designates a child who was assigned to Head Start but did not participate in Head Start
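Since every file is child level and carries HSIS_CHILDID, files can be linked on that ID and then split by CHILDCOHORT. The sketch below shows the idea with plain Python records. The ID values, the value codes for CHILDRESULTGROUP, NOSHOW, and CROSSOVER, and the SCORE and READS_DAILY columns are hypothetical placeholders; the actual codes are documented in the codebooks.

```python
def merge_on_child_id(left_rows, right_rows, key="HSIS_CHILDID"):
    """Inner-join two lists of child-level records on the child ID."""
    right_by_id = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_by_id.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


def split_by_cohort(rows, cohort_var="CHILDCOHORT"):
    """Group records by age cohort so each cohort can be analyzed separately."""
    by_cohort = {}
    for row in rows:
        by_cohort.setdefault(row[cohort_var], []).append(row)
    return by_cohort


# Made-up rows standing in for two instrument files (all values hypothetical).
assessment = [
    {"HSIS_CHILDID": "C01", "CHILDCOHORT": 3, "CHILDRESULTGROUP": "head_start",
     "NOSHOW": 1, "CROSSOVER": 0, "SCORE": 41.0},
    {"HSIS_CHILDID": "C02", "CHILDCOHORT": 4, "CHILDRESULTGROUP": "control",
     "NOSHOW": 0, "CROSSOVER": 1, "SCORE": 38.5},
    {"HSIS_CHILDID": "C03", "CHILDCOHORT": 3, "CHILDRESULTGROUP": "control",
     "NOSHOW": 0, "CROSSOVER": 0, "SCORE": 36.0},
]
interview = [
    {"HSIS_CHILDID": "C01", "READS_DAILY": 1},
    {"HSIS_CHILDID": "C03", "READS_DAILY": 0},
]

merged = merge_on_child_id(assessment, interview)
cohorts = split_by_cohort(merged)
no_shows = [r["HSIS_CHILDID"] for r in assessment if r["NOSHOW"] == 1]
```

Children without a record in both files drop out of an inner join like this one, which is one reason the choice of weight depends on which instruments an analysis combines.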
Each instrument file includes the source variables from the assessment, survey, or interview followed by the derived variables relating to the assessment, survey, or interview. In addition, center or school IDs and subsetting or classroom IDs are included in the files for the Other Care Provider, Teacher Survey, Teacher/Care Provider-Child Report (TCR), Family Day Care Observation, Classroom Observation, and Child Experiences. Center IDs are included in the Center Director files. The instrument-based data files contain both source variables (non-copyrighted items only) and variables derived solely from that instrument. They are described in Table 2.
|Data Collection Period||Instrument||Level||Grade||Cohort||Observations||Variables|
|Fall 2002||Child Assessment||Child||Head Start & Non-Head Start||3/4||4,442||107|
|Fall 2002||Parent Interview||Child||Head Start & Non-Head Start||3/4||4,442||590|
|Fall 2002||Covariates and Subgroup Variables||Child||Head Start & Non-Head Start||3/4||4,442||24|
|Spring 2003||Care Provider||Child||Head Start & Non-Head Start||3/4||122||214|
|Spring 2003||Center Director||Child||Head Start & Non-Head Start||3/4||2,380||261|
|Spring 2003||Child Assessment||Child||Head Start & Non-Head Start||3/4||4,442||105|
|Spring 2003||Classroom Observation||Child||Head Start & Non-Head Start||3/4||2,383||152|
|Spring 2003||Family Childcare Observation||Child||Head Start & Non-Head Start||3/4||47||92|
|Spring 2003||Parent Interview||Child||Head Start & Non-Head Start||3/4||4,442||549|
|Spring 2003||Teacher Survey||Child||Head Start & Non-Head Start||3/4||2,428||195|
|Spring 2003||Teacher's Child Report||Child||Head Start & Non-Head Start||3/4||2,539||38|
|Spring 2003||Child Experiences (2003 and 2004)||Child||Head Start & Non-Head Start||3/4||4,442||73|
|Spring 2004||Care Provider||Child||Head Start/Kindergarten||3/4||16||214|
|Spring 2004||Center Director||Child||Head Start & Non-Head Start/Kindergarten||3/4||1,521||260|
|Spring 2004||Child Assessment||Child||Head Start & Non-Head Start/Kindergarten||3/4||4,442||123|
|Spring 2004||Classroom Observation||Child||Head Start & Non-Head Start/Kindergarten||3/4||1,651||152|
|Spring 2004||Family Childcare Observation||Child||Head Start & Non-Head Start/Kindergarten||3/4||10||92|
|Spring 2004||Parent Interview||Child||Head Start & Non-Head Start/Kindergarten||3/4||4,442||641|
|Spring 2004||Teacher Survey||Child||Head Start & Non-Head Start/Kindergarten||3/4||2,760||254|
|Spring 2004||Teacher's Child Report||Child||Head Start & Non-Head Start/Kindergarten||3/4||2,780||80|
|Spring 2005||Child Assessment||Child||Kindergarten/1st grade||3/4||4,442||141|
|Spring 2005||Parent Interview||Child||Kindergarten/1st grade||3/4||4,442||491|
|Spring 2005||Teacher Survey||Child||Kindergarten/1st grade||3/4||3,028||201|
|Spring 2005||Teacher's Child Report||Child||Kindergarten/1st grade||3/4||3,044||95|
|Spring 2006||Child Assessment||Child||1st grade||3||2,449||132|
|Spring 2006||Parent Interview||Child||1st grade||3||2,449||436|
|Spring 2006||Teacher Survey||Child||1st grade||3||1,760||201|
|Spring 2006||Teacher's Child Report||Child||1st grade||3||1,772||95|
Sampling weights were calculated for each child to allow estimates based on the sample to represent the national population of newly entering Head Start participants for 2002. Because children were randomly assigned to Head Start (i.e., the "program or Head Start" group) and non-Head Start (i.e., the "control" group) groups within each Head Start center, the two groups represent the same Head Start population of newly entering children when appropriately weighted. The only difference, theoretically, is that the Head Start group was allowed access to attend Head Start at the time of random assignment, while the control group was not. Each study child was assigned a base weight that reflected his/her overall probability of selection, including the sampling of broad geographic areas used as primary sampling units (PSUs), Head Start grantees/delegate agencies, and centers. These base weights were then adjusted for non-response to the child assessment and parent interview at each wave of data collection, to produce separate fall 2002, spring 2003, spring 2004, spring 2005, and spring 2006 weights.
The non-response-adjusted weights of children in the 4-year-old group were post-stratified to the Head Start National Reporting System (HSNRS) newly entering enrollment totals for 4-year-olds (comparable totals for 3-year-olds were not available). Extremely large weights were then trimmed for both age groups. The final child and parent weights are the product of the overall base weight, a non-response adjustment factor, a post-stratification factor, and a trimming factor. For variance estimation, a set of 76 jackknife replicate weights was created for each child weight.
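The weight construction described above is a chain of multiplications, which the following sketch spells out. It assumes the base weight is the inverse of the overall selection probability, itself taken as the product of the stage-level probabilities. That is the usual convention, but the exact HSIS computation is documented in the User Guide and Technical Report. All numbers and function names are made up for illustration.

```python
def base_weight(stage_probabilities):
    """Inverse of the overall selection probability (product over sampling stages)."""
    overall = 1.0
    for p in stage_probabilities:
        overall *= p
    return 1.0 / overall


def final_weight(base, nonresponse_adj, poststrat_factor, trim_factor):
    """Product of the base weight and the three adjustment factors."""
    return base * nonresponse_adj * poststrat_factor * trim_factor


# Made-up probabilities for three stages: geographic PSU, agency, center.
base = base_weight([0.5, 0.25, 0.2])        # overall probability 0.025
weight = final_weight(base,
                      nonresponse_adj=1.25,   # made-up adjustment
                      poststrat_factor=0.9,   # made-up adjustment
                      trim_factor=1.0)        # 1.0 means no trimming applied
```

Under this sketch, a child whose weight was not trimmed has a trimming factor of 1, and a 3-year-old (for whom no post-stratification totals were available) would have a post-stratification factor of 1.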
Due to the complexity of the survey data, replication methods and weights must be used to estimate standard errors. The HSIS weights file includes a set of 76 jackknife replicate weights created for each child for use in the calculation of all standard errors. The jackknife replicate weights must be used in conjunction with the "JKN" factors that are also provided with the data to obtain the correct standard error estimates. The easiest way for analysts to incorporate the weights and correct standard errors into their analyses is to use software designed for analysis of complex survey data such as WesVar, SUDAAN, Stata, or the SAS survey procedures.
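For readers who want to see what survey software does with the replicate weights and JKN factors, here is a minimal sketch of the standard jackknife variance formula: the estimate is recomputed with each replicate weight, and the squared deviations from the full-sample estimate are summed after multiplying each by its replicate's factor. The toy data use two replicates rather than 76, and all values and function names are made up. Dedicated survey software remains the recommended route for real analyses.

```python
import math


def weighted_mean(values, weights):
    """Return the weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_se(values, full_weights, replicate_weights, jkn_factors):
    """Return (estimate, standard error) using replicate weights and JKN factors.

    variance = sum over replicates r of factor_r * (theta_r - theta) ** 2
    """
    theta = weighted_mean(values, full_weights)
    variance = 0.0
    for rep_weights, factor in zip(replicate_weights, jkn_factors):
        theta_r = weighted_mean(values, rep_weights)
        variance += factor * (theta_r - theta) ** 2
    return theta, math.sqrt(variance)


# Made-up outcome values and weights for four children.
values = [10.0, 12.0, 9.0, 11.0]
full_weights = [1.0, 1.0, 1.0, 1.0]
replicate_weights = [
    [2.0, 0.0, 1.0, 1.0],   # replicate 1: drop child 2, double child 1
    [1.0, 1.0, 2.0, 0.0],   # replicate 2: drop child 4, double child 3
]
jkn_factors = [0.5, 0.5]

estimate, std_error = jackknife_se(values, full_weights, replicate_weights, jkn_factors)
```

The same function works for any statistic if `weighted_mean` is swapped for another weighted estimator, such as a difference in group means.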
The HSIS data files contain the following weights and replicate weights:
- Program weights
- Center weights
- Child level base weights and their replicates
- Cross-sectional weights:
  - Fall 2002 Cross-sectional child assessment weights and their replicates
  - Fall 2002 Cross-sectional parent interview weights and their replicates
  - Spring 2003 Cross-sectional child assessment weights and their replicates
  - Spring 2003 Cross-sectional parent interview weights and their replicates
  - Spring 2003 Cross-sectional center director interview weights and their replicates
  - Spring 2003 Cross-sectional teacher weights and their replicates
  - Spring 2003 Cross-sectional classroom observation weights and their replicates
  - Spring 2004 Cross-sectional child assessment weights and their replicates
  - Spring 2004 Cross-sectional parent interview weights and their replicates
  - Spring 2004 Cross-sectional center director interview weights and their replicates
  - Spring 2004 Cross-sectional teacher weights and their replicates
  - Spring 2004 Cross-sectional classroom weights and their replicates
  - Spring 2005 Cross-sectional child assessment weights and their replicates
  - Spring 2005 Cross-sectional parent interview weights and their replicates
  - Spring 2005 Cross-sectional teacher weights and their replicates
  - Spring 2006 Cross-sectional child assessment weights
  - Spring 2006 Cross-sectional parent interview weights and their replicates
  - Spring 2006 Cross-sectional teacher weights and their replicates
- Longitudinal weights (children with measures at 2 or more points in time) and their replicates
- Longitudinal weights (children with measures at 3 or more points in time) and their replicates
- Longitudinal weights (children with a teacher survey and teacher-child rating in kindergarten and 1st grade) and their replicates
Careful consideration should be given to the choice of a weight for a specific analysis since it depends on the type of data analyzed. Each set of weights is appropriate for a different set of data or combination of sets of data. Due to non-response adjustments, sometimes the decision of which weight to use is affected by how much of the sample will be excluded if one weight were selected over the other. A more detailed description of the weights is available in the User Guide.
How to Obtain Data and
Downloading Data and Documentation from the PreK-3rd Data Resource Center
Data from the Head Start Impact Study (HSIS), 2002-2006 are made available through the Child Care and Early Education Research Connections project, a data archive within ICPSR.
Researchers interested in downloading analysis-ready data and documentation files can do so free of charge through the PreK-3rd Data Resource Center's Datasets & Resource Guide web page. All HSIS datasets are restricted, and accessing these files requires a signed User Agreement (PDF 430K). Each data file and the weights file are provided as ASCII (text) files with SAS and SPSS setup (syntax) files. All documentation is provided in PDF format, and each codebook is also available in text and HTML formats.
To download the HSIS data and/or documentation, researchers must agree to the terms and conditions of use. To download files, select Datasets & Resource Guides from the menu options located in the top right corner of the PreK-3rd Data Resource Center home page. Locate the Access Data link for the Head Start Impact Study (HSIS), 2002-2006. Clicking on the link will redirect you to the Research Connections website and the Resource Page, where you can apply for the restricted data files or download and browse documentation files. Before downloading the data or beginning analysis, it is important to become familiar with the User Guide and Questionnaires.
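Analysts working outside SAS or SPSS can read the ASCII files directly once the column positions are known. The sketch below assumes a fixed-column layout and uses an invented three-variable layout. The real variable names, positions, and formats must be taken from the SAS or SPSS setup files, and the SCORE column and ID values here are hypothetical.

```python
import io

# Hypothetical layout: (variable name, start column, end column), 1-based inclusive.
# Real positions come from the setup files distributed with the data.
LAYOUT = [
    ("HSIS_CHILDID", 1, 6),
    ("CHILDCOHORT", 7, 7),
    ("SCORE", 8, 12),          # hypothetical variable
]


def parse_fixed_width(lines, layout):
    """Slice each non-empty line into named string fields according to the layout."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        rows.append({name: line[start - 1:end].strip()
                     for name, start, end in layout})
    return rows


# Made-up two-record file held in memory instead of on disk.
sample_file = io.StringIO("C000013 42.5\nC000024  9.0\n")
rows = parse_fixed_width(sample_file, LAYOUT)
```

All fields come back as strings, so numeric variables still need converting, and any missing-value codes defined in the setup files need handling before analysis.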