PreK-3rd Data Resource Center: The First Six Years of Schooling and Beyond

National Head Start/Public School Early Childhood Transition Demonstration Study Resource Guide

Introduction

About the Guide

This resource guide provides a brief overview of the National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999, as well as instructions for obtaining a copy of the data and for linking the component datasets.

About the Data

The National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999 (NTDS) is a study sponsored by the U.S. Department of Health and Human Services that collects data on the process and outcomes of the National Head Start/Public School Early Childhood Transition Demonstration Project. The purpose of the project is to assist children in successfully transitioning from Head Start programs to the public school system. Head Start, initiated in 1965, provides services to low-income children with the goal of mitigating disadvantage in early learning experiences. It provides services in the areas of education, health, social services, and parental involvement to children across the United States. In 1991, Congress responded to concerns about children's retention of skills and resources after Head Start by legislating an extension program known as the Transition Demonstration program. This program is intended to improve continuity between Head Start and the public school curriculum by providing services in the areas of health, mental health, parenting education, and other social services.

The Transition Demonstration Project (TDP) was implemented at 31 local sites chosen through a competitive grant process. Each of these 31 programs emerged from a national study known as the National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999 which evaluated the impact of the programs on children and families. The TDP Study provided a set of guidelines or recommended survey components. However, each site was directed to implement these guidelines in a manner consistent with local needs and strengths. The TDP Study was aimed at addressing and measuring a series of questions related to the efficacy of Transition programs, including:

  • How have Transition Demonstration Programs been implemented at the local level?

  • What are the barriers to implementing these programs?

  • What institutional and systematic changes are evident at local sites as a result of Transition programs?

  • Do children and families in the Transition Demonstration group display better adjustment to school and improved functioning relative to comparison groups that do not receive services?

  • Which families and children experience more difficulty transitioning from Head Start to the public school system?

This resource guide provides an introductory description of the National Head Start/Public School Early Childhood Transition Demonstration Study datasets, which collect information from the 31 TDP sites. This information is drawn from the National Head Start/Public School Early Childhood Transition Demonstration Study Project Summary and Description of Data.

Acknowledgements

This resource guide was prepared by Donald J. Hernandez, Department of Sociology, University at Albany, State University of New York. It was developed for the PreK-3rd Data Resource Center: The First Six Years of Schooling and Beyond, a Web site hosted by ICPSR with support from the Foundation for Child Development.

Sample

Respondents for the National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999 were selected for inclusion by staff at the 31 program sites. Study sites are located in rural, town, suburban and urban locations in 30 different states, with one located within the Navajo Nation. To construct the study sample, each site identified two clusters of elementary school units with Head Start programs. These two clusters had to be similar in terms of the children served and the services provided. One cluster was then randomly selected to be the Demonstration (treatment) group that received enhanced transition services under the Transition Demonstration Program while the other Comparison (control) group received standard services. Thus, children and families were assigned to the Demonstration and Comparison groups according to which school they attended at the beginning of the study period.

Two cohorts of children and their families were recruited to participate in the study. The first cohort included 3,540 children who previously attended a Head Start program and entered kindergarten in the fall of 1992. The second group (Cohort 2) is composed of 3,975 children who attended Head Start and entered kindergarten in 1993. In addition to these two primary cohorts, 22 of the 31 TDP sites chose to enroll a group of students and families who did not attend Head Start to serve as an additional reference group. The following table provides sample sizes for each of the study groups:

Table 1: Size of Demonstration and Comparison Samples*

            Former Head Start              Non-Head Start
            Demonstration   Comparison     Demonstration   Comparison
Cohort 1    1,889           1,651          869             828
Cohort 2    2,039           1,936          863             754
Total       3,928           3,587          1,732           1,582

* Figures taken from TDP Study Project Summary.
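
As a quick consistency check, the cell counts in Table 1 can be summed programmatically. The sketch below is illustrative Python and is not part of the NTDS distribution; the dictionary layout and the function name are assumptions made for this example.

```python
# Cell counts transcribed from Table 1 (TDP Study Project Summary).
# Keys are (cohort, Head Start status, study group); the layout is illustrative.
TABLE1 = {
    (1, "former_head_start", "demonstration"): 1889,
    (1, "former_head_start", "comparison"): 1651,
    (1, "non_head_start", "demonstration"): 869,
    (1, "non_head_start", "comparison"): 828,
    (2, "former_head_start", "demonstration"): 2039,
    (2, "former_head_start", "comparison"): 1936,
    (2, "non_head_start", "demonstration"): 863,
    (2, "non_head_start", "comparison"): 754,
}


def total(cohort=None, status=None, group=None):
    """Sum the cells matching the given filters (None matches everything)."""
    return sum(
        n
        for (c, s, g), n in TABLE1.items()
        if cohort in (None, c) and status in (None, s) and group in (None, g)
    )


print(total(cohort=1, status="former_head_start"))  # 3540
print(total(cohort=2, status="former_head_start"))  # 3975
print(total())                                      # 10829
```

The two former Head Start totals reproduce the cohort sizes given in the text, and the grand total of 10,829 equals the number of child/family cases reported in this guide for the combined longitudinal dataset.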

As the TDP Study is a longitudinal project of data collection occurring during four school years, participants are considered to have fully participated if data was available for the child or family in each of the four data collection school years. Participants who drop out of the study, either because they change schools or refuse continued participation, remain in the dataset for the years for which data was successfully collected from them. All records are scored according to the number of years of program participation. Children who participate in the TDP receive one point for each year they are exposed to this treatment intervention. Children who receive four years of Transition services receive a score of four (4), while children in the comparison group who remain in that group throughout the data collection period receive a score of zero. This procedure was used to develop the information presented in Table 2.
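
The scoring rule described above amounts to counting the school years in which a child received Transition services. A minimal sketch, assuming a hypothetical input of four yes/no flags per child (the actual derivation used by the study team may differ):

```python
def treatment_years(received_services):
    """Count the school years in which a child received Transition services.

    `received_services` is a sequence of four booleans, one per data
    collection school year. This input format is an assumption for
    illustration, not the layout of the NTDS files.
    """
    if len(received_services) != 4:
        raise ValueError("expected one flag per school year (4 in total)")
    return sum(bool(flag) for flag in received_services)


print(treatment_years([True, True, True, True]))      # 4
print(treatment_years([False, False, False, False]))  # 0
print(treatment_years([True, False, True, False]))    # 2
```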

About sixty percent of families in the Demonstration Group participated in the transition program for all four years, while roughly 6% of Comparison Group members participated in the transition program for two or more years. The following table displays the percent of participants receiving specified years of transition services, by group designation.

Table 2: Percent of Participants Receiving Services, by Group Membership*

                      Treatment years
                      0       1       2       3       4
Demonstration group   0       13.1    11.7    15.4    59.8
Comparison group      87.2    6.5     3.2     2.5     0.6

* Obtained from TDP Study Project Summary.

Data Elements

Data Collection

The Transition Demonstration Program is premised on the idea that children's successful transitions to school are dependent upon a wide variety of individual, family, and community factors and characteristics. These include academic and work skills, health and nutrition, social supports, motivation, and values. The TDP Study is designed to capture information about these varied factors and processes in order to determine their influence on children's success in kindergarten through third grade. The study relies on several data sources. Data was collected for each cohort from each of the following sources at five specific points in time between fall 1992 and spring 1997:

  • Interviews with family members

  • Direct child assessments

  • Standardized teacher ratings of individual children, classrooms, and schools

  • Reports by principals

  • Classroom observations

  • Children's school records

  • Community-level characteristics from public information sources

As noted previously, the study sample consists of two cohorts of children; Cohort 1 refers to children who entered kindergarten in 1992, while Cohort 2 refers to children entering kindergarten in 1993. Cohort 1 data collection occurred at five points, the fall and spring of their kindergarten year (1992 and 1993), and then in the spring of their first, second and third grade years (1994, 1995 and 1996). Data collection for the second cohort took place at the same points in their educational process, and occurred in the fall of 1993 and the spring of 1994, 1995, 1996, and 1997. Kindergarten data is available in a single, combined file that includes the earliest reported information from each informant.
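
Because the second cohort follows the same schedule one year later, the collection calendar for either cohort can be derived from the Cohort 1 dates. The following Python sketch is illustrative only; the names are assumptions and nothing here ships with the data.

```python
# Cohort 1 collection points as (season, year, grade), taken from the text above.
COHORT1_SCHEDULE = [
    ("fall", 1992, "kindergarten"),
    ("spring", 1993, "kindergarten"),
    ("spring", 1994, "first grade"),
    ("spring", 1995, "second grade"),
    ("spring", 1996, "third grade"),
]


def schedule(cohort):
    """Return the collection points for cohort 1 or 2 (cohort 2 runs one year later)."""
    if cohort not in (1, 2):
        raise ValueError("cohort must be 1 or 2")
    shift = cohort - 1
    return [(season, year + shift, grade) for season, year, grade in COHORT1_SCHEDULE]


for season, year, grade in schedule(2):
    print(f"{season} {year}: {grade}")
```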

Data Structure

Data elements in the National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999 include a wide variety of indicators of child development, including direct and indirect assessments of children's cognitive, emotional, and social development, socio-demographic characteristics of the child and immediate family members, as well as neighborhood, classroom, and school characteristics. Data elements are divided into six categories based on the source of the information. The table below presents the major data categories or procedures, organized by informant, for each of the five data collection periods for Cohort 1 (Data collection for Cohort 2 is shifted forward by one year).

Table 3: Primary Data Categories by Data Source*

Data Source Category    Topic                                              Fall 1992 / Spring 1993 / Spring 1994 / Spring 1995 / Spring 1996
CHILD                   Peabody Picture Vocabulary Test                    XXXXX
                        Woodcock-Johnson Achievement Test                  XXXXX
                        What I Think of School                             XXXX
                        Writing Sample                                     XX
FAMILY                  Getting to Know Your Family                        X
                        Family Background Interview                        XXXXX
                        Family Resource Scale                              XX
                        Family Routines Questionnaire                      XX
                        Primary Caregiver Health: Depression               XXX
                        Social Skills Rating System                        XXXXX
                        Your Child's Health and Safety                     XX
                        Parenting Dimensions                               XXX
                        School Climate Survey                              XXXX
                        Neighborhood Scales                                XXX
                        Your Child's Adjustment to School                  XXXX
                        Family Involvement in Children's Learning          XX
TEACHER                 Child Health Questionnaire                         XXXX
                        School Climate Survey                              XXXX
                        Social Skills Rating System                        XXXXX
                        School Survey of Early Childhood Programs          XXXX
PRINCIPAL               School Climate Survey                              XXXX
                        School Survey of Early Childhood Programs          XXXX
EXISTING RECORDS        School Archival Records Search                     XXXX
CLASSROOM OBSERVATION   Assessment Profile for Early Childhood Programs    XXXX
                        ADAPT classroom practices assessment               XX

* Obtained from TDP Study Project Summary.

The National Head Start/Public School Early Childhood Transition Demonstration Study consists of 23 individual datasets. A description of each data category is provided below.

Child/Family Unit Datasets

There are six child/family unit datasets. Three datasets exist for kindergarten, corresponding to the fall and spring data collection periods as well as a combined dataset. The remaining three data files correspond to the first, second and third grade collections. Data for the two cohorts (1992, 1993) has been combined in each of these files.

In the six child/family unit datasets the child/family is the unit of analysis. Each child/family unit is listed on a separate line in the dataset and variables are considered attributes of those units. A child/family record is created when one or more of the following collection instruments have been completed: family interview, record of child test scores, school archival record, or teacher questionnaire. Complete information for the child is not required in order for that child to appear in the dataset as a valid entry.

Child/Family datasets are listed on the NTDS CD with the following file prefixes, followed by the data file type suffix (.sas7bdat for SAS and .sav for SPSS):

KGEARLY
KGLATE
KGCOMBIN
FIRST
SECOND
THIRD

In addition, a single rectangular dataset is provided that combines the following child/family units: kindergarten (combined), first grade, second grade, and third grade. This dataset is recommended for users interested in longitudinal analysis of child/family data or for researchers interested only in child/family data. The dataset contains 10,829 child/family cases. Variable names begin with the prefixes KN_, FG_, SG_, and TG_ to distinguish variables pertaining to the kindergarten, first grade, second grade, and third grade observation periods. The full name of each variable consists of one of these prefixes followed by the name of the corresponding variable in the original dataset from which it was obtained. The only exception is NEWID, which carries no prefix because it uniquely identifies a child/family and therefore does not change values across datasets or data collection periods.
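
The naming convention can be applied mechanically when working with the combined file. The sketch below, in Python, splits a variable name into its observation period and original name; the function name is an assumption, and HOUSE is used only because it is mentioned elsewhere in this guide.

```python
# Prefixes used in the combined child/family dataset, as described above.
PREFIXES = {
    "KN_": "kindergarten (combined)",
    "FG_": "first grade",
    "SG_": "second grade",
    "TG_": "third grade",
}


def split_flat_name(name):
    """Return (period, original_name) for a variable in the combined file.

    NEWID carries no prefix, so its period is reported as None.
    """
    if name == "NEWID":
        return (None, "NEWID")
    for prefix, period in PREFIXES.items():
        if name.startswith(prefix):
            return (period, name[len(prefix):])
    raise ValueError(f"unrecognized variable name: {name}")


print(split_flat_name("FG_HOUSE"))  # ('first grade', 'HOUSE')
print(split_flat_name("NEWID"))     # (None, 'NEWID')
```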

School Unit Datasets

School Unit data is organized into five datasets, each corresponding to a specific data collection point (fall kindergarten, spring kindergarten, spring first grade, spring second grade, or spring third grade). A unique school identifier (RANDSCH) can be used to merge school datasets with the child/family data. Alternatively, the identifier may also be used to merge observations related to a single school.

In the school unit data, separate records exist for each source providing school-based information. The first records in the dataset are from teachers, followed by records from principals, and then records from the family interview. Another identifier, REC_SRC, indicates the source of the data, with valid entries beginning with "fi" for the family interview, "ta" for part A of the teacher questionnaire, and "qp" for the principal questionnaire.
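
Records can be grouped by informant using the leading characters of REC_SRC. The Python sketch below is a minimal illustration; the sample REC_SRC values and the school number are invented, and only the two-character prefixes come from this guide.

```python
# Two-character REC_SRC prefixes described above.
REC_SRC_PREFIXES = {
    "fi": "family interview",
    "ta": "teacher questionnaire, part A",
    "qp": "principal questionnaire",
}


def record_source(rec_src):
    """Classify a REC_SRC value by its first two characters."""
    return REC_SRC_PREFIXES.get(rec_src.strip().lower()[:2], "unknown")


# Hypothetical school-unit records; the REC_SRC values are made up for illustration.
records = [
    {"RANDSCH": 101, "REC_SRC": "ta93"},
    {"RANDSCH": 101, "REC_SRC": "qp93"},
    {"RANDSCH": 101, "REC_SRC": "fi93"},
]

by_source = {}
for rec in records:
    by_source.setdefault(record_source(rec["REC_SRC"]), []).append(rec)

print(sorted(by_source))
```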

The recommended method is to link data directly from the school file to the child/family data. However, for users interested only in the school data, school datasets can be arranged such that variables for a particular school appear in a single "school" record. Note that in the flattened child/family dataset, the RANDSCH variable for a specific data collection period begins with the appropriate prefix (i.e., KN_RANDSCH, FG_RANDSCH, SG_RANDSCH, or TG_RANDSCH). Therefore, before linking the school datasets to the flattened child/family dataset, the user should rename the RANDSCH variable on each school dataset with the prefix for the corresponding observation period.
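
The renaming step can be sketched as follows in Python. This is an illustration under stated assumptions, not a prescribed procedure: records are represented as dictionaries, the school data are assumed to have been reduced to one record per school, and FG_ENROLL is an invented variable name.

```python
def rename_key(records, old, new):
    """Return copies of the records with variable `old` renamed to `new`."""
    return [{(new if k == old else k): v for k, v in rec.items()} for rec in records]


def attach_school(flat_records, school_records, key):
    """Left-join school variables onto flat-file records using `key`.

    Assumes one record per school in `school_records`, i.e. the school data
    have already been arranged as a single record per RANDSCH.
    """
    lookup = {rec[key]: rec for rec in school_records}
    return [{**rec, **lookup.get(rec.get(key), {})} for rec in flat_records]


# Hypothetical data; FG_ENROLL is an invented variable name.
flat = [{"NEWID": 1, "KN_RANDSCH": 10, "FG_RANDSCH": 12}]
first_grade_schools = [{"RANDSCH": 12, "FG_ENROLL": 450}]

schools = rename_key(first_grade_schools, "RANDSCH", "FG_RANDSCH")
linked = attach_school(flat, schools, "FG_RANDSCH")
print(linked)
```

Renaming the key on the school side, rather than on the child/family side, leaves the flattened file untouched so that the same child record can be linked to a different school in each observation period.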

The five school datasets included on the NTDS CD are labeled with the following file prefixes, followed by the data file type suffix (.sas7bdat for SAS and .sav for SPSS):

SCSP93
SCSP94
SCSP95
SCSP96
SCSP97

Classroom Unit Datasets

The five Classroom Unit datasets correspond to each of the five specific data collection points (fall kindergarten, spring kindergarten, spring first grade, spring second grade, or spring third grade). These datasets are structured with classrooms as the unit of analysis.

Thus, each record contains attributes of a particular classroom, such as teacher characteristics and the aggregate characteristics of its students (e.g., number of female and male children, and the racial-ethnic makeup of the class). Data in the classroom unit datasets are derived from several sources including: classroom observation, teacher questionnaires, and administrative records.

The classroom unit datasets are labeled with the following file prefixes, followed by the file type suffix (.sas7bdat for SAS and .sav for SPSS):

CLASS93
CLASS94
CLASS95
CLASS96
CLASS97

Exit Interview Dataset

Exit interviews were conducted with families, family service specialists, principals and teachers at the end of the third grade data collection period. Interview data are in separate files by data source but are combined across cohorts. The unit of analysis for the exit interview files varies.

The file containing exit interviews completed by the family has the child/family as the unit of analysis, similar to the child/family dataset. The family service specialist exit interview is also organized at the child/family level and includes the identifier NEWID so that these data can be merged with the child/family dataset. Exit interviews with principals contain information about each school and thus have the school as the unit of analysis. There are two teacher exit interview datasets. The 'A' file 'eta' contains classroom level information and can be linked with the primary classroom unit datasets, while the 'B' file 'etb' contains child level information which may be linked to the Child/Family datasets.

The exit interview datasets are labeled with the following file prefixes, followed by the data file type suffix (.sas7bdat for SAS and .sav for SPSS):

EF
EFS
EQP
ETA
ETB

Note: There is only one data dictionary (called EXITDATA), which is labeled for the child and family respondents but refers to variables in all five datasets.

Community Characteristics Dataset

Data in the community characteristics dataset are organized hierarchically. Records are specific to four types of study units: program site, county, school district, and school. Information in this dataset was collected from both printed and electronic secondary sources rather than from study participants. Because the organization of this dataset is quite complicated, first time users are cautioned to read documentation carefully. The variable RECID can be decomposed into three individual codes for county, district, and school. The first two characters are the county code, the third and fourth characters constitute the district code, and the final three characters are the school code. The community characteristics dataset can be located under the label COMCHARS followed by the data file type suffix (.sas7bdat for SAS and .sav for SPSS).
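
The decomposition of RECID can be written as a small helper. The Python sketch below assumes RECID is stored as a seven-character string with any leading zeros intact; the sample value is invented, and users should confirm the actual storage format in the data dictionary.

```python
def split_recid(recid):
    """Split a 7-character RECID into county, district, and school codes.

    Characters 1-2 are the county, 3-4 the district, and 5-7 the school.
    """
    recid = str(recid)
    if len(recid) != 7:
        raise ValueError("expected a 7-character RECID")
    return {"county": recid[0:2], "district": recid[2:4], "school": recid[4:7]}


print(split_recid("0102003"))
# {'county': '01', 'district': '02', 'school': '003'}
```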

Program Implementation Profile Dataset

This data file provides information on the implementation of the National Transition Demonstration Project at each of the sites. This information is self-reported by site personnel and covers topics related to the four primary components of the NTDP: social services, family involvement, education, and health.

Create Extract File

Accessing the Data

Data from the National Head Start/Public School Early Childhood Transition Demonstration Study, 1991-1999 are available through the Child Care and Early Education Research Connections project and are restricted from general dissemination. To obtain these data, researchers must agree to the terms and conditions of a Restricted Data Use Agreement. A copy of this agreement and information on application procedures may be obtained by contacting Research Connections staff.

Upon approval, the complete set of datasets, including the aforementioned "flattened file," is sent to researchers on CD.

Before using data on the NTDS CD, files must be unzipped using WinZip. Unzipped data and documentation are stored in a folder called Public Data. Open the folder Public Data to find several additional folders and documents. For example, the Word document titled DERIVED2.doc is an 11-page document that describes the scales and assessment questions that appear in the dataset as well as the valid answers for each. Users interested in scores or scales from the NTDS should review this document carefully.

Within the folder Public Data you will also see three folders labeled Convert, PDF, and wordproc. The folders PDF and wordproc contain separate data dictionaries for each dataset described above in .pdf and MSWord formats. In general, the data dictionary documents contain the same label or name as the data files. You will also find a document in each folder labeled ARCHIVNG. This contains the official resource guide for the NTDS data. Detailed information about survey methodology and data elements can be found in this document.

Opening the folder Convert will bring the user to the SAS and SPSS datasets. To access the SPSS datasets, click on the folder SPSS within Convert. Each of the datasets in these folders is labeled as described above. SPSS users can click on a dataset name to view the data.

SAS users must perform an additional step before viewing the data. In addition to the 22 datasets in this folder, you will notice a yellow folder marked Formats. To view the SAS datasets, the user must open the SAS application and run a short section of code that directs the program to read the data formats before opening the data files. To do this, begin by writing a libname statement assigning a library to the data location (see below). You will need to adjust the libname statement according to the location in which you have stored the data files. Within the quotation marks, replace the letter 'R' with the letter of the drive on which you have saved the data, and then supply the path to that location in place of '\Head_Start\Public data\'. The folder name Convert should remain in this statement, as it is the folder name provided on the CD. Next, set the fmtsearch option, followed by an equals sign (=) and the library name in parentheses (see below). Running these statements allows SAS to open the data files.

SAS code example:

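/* Assign the libref LIBRARY to the Convert folder; adjust the drive and path. */
libname library "R:\Head_Start\Public data\Convert";
/* Direct SAS to search that library for the data formats. */
options fmtsearch=(library);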

Linking Records across Datasets

Many users of NTDS data will be interested in utilizing variables from multiple data sources that are located on different datasets. Although the various datasets are organized with different units of analysis, it is possible to merge datasets in order to link child/family records with school or classroom characteristics. Each of the datasets described above contains one or more unique identifiers that can be used to link related records in different datasets.

Following are instructions for linking child records to the school, class, exit, program implementation, and community characteristics datasets. It is recommended that the user merge datasets after limiting each file to the desired group of variables. Each of the NTDS datasets is large and can be unwieldy to the new user. Limiting the number of variables included in the merge will make the final product more manageable.

As noted above, the child/family datasets (e.g., KGCOMBIN) contain records at the child level. Each of these datasets contains three important identifiers:

  • NEWID: a unique family identifier

  • RANDSCH: a unique school identifier

  • CLASSCDE: a unique classroom identifier when used in conjunction with RANDSCH

To link child level data to information about that child's school, you will need the school and classroom identifiers. The reason for this is that within the school datasets (SCSP93-97) a separate record exists for each classroom unit. The identifier CLASSCDE is not unique within the school dataset. Merge the child and school datasets such that records are combined only when matched by RANDSCH and CLASSCDE.

After linking child/family data to the relevant school data, the user may attach classroom information using the same approach. Perform a merge of the child/school dataset and the classroom dataset using the school (RANDSCH) and class (CLASSCDE) identifiers. In either set of merges, some additional records may be created when school identifiers are not matched to a child ID. After reviewing the results of the merge, eliminate all records missing a child ID.
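
The merge-then-filter procedure described above can be sketched in Python as a full outer join on RANDSCH and CLASSCDE followed by removal of records with no child ID. This is an illustration only: records are represented as dictionaries, the sample values are invented, and TCHYEARS is a hypothetical variable name.

```python
def outer_merge(left, right, keys):
    """Full outer join of two lists of dict records on the given key variables."""
    def key_of(rec):
        return tuple(rec.get(k) for k in keys)

    right_by_key = {}
    for rec in right:
        right_by_key.setdefault(key_of(rec), []).append(rec)

    merged, matched = [], set()
    for lrec in left:
        k = key_of(lrec)
        if k in right_by_key:
            matched.add(k)
            merged.extend({**lrec, **rrec} for rrec in right_by_key[k])
        else:
            merged.append(dict(lrec))
    # Right-hand records with no matching left-hand record are kept as well.
    for k, recs in right_by_key.items():
        if k not in matched:
            merged.extend(dict(r) for r in recs)
    return merged


# Hypothetical records; TCHYEARS is an invented variable name.
children = [
    {"NEWID": 1, "RANDSCH": 10, "CLASSCDE": 1},
    {"NEWID": 2, "RANDSCH": 10, "CLASSCDE": 2},
]
school = [
    {"RANDSCH": 10, "CLASSCDE": 1, "TCHYEARS": 7},
    {"RANDSCH": 11, "CLASSCDE": 1, "TCHYEARS": 3},  # no sampled child in this class
]

merged = outer_merge(children, school, ["RANDSCH", "CLASSCDE"])
# Drop records that did not match any child (no NEWID).
linked = [rec for rec in merged if rec.get("NEWID") is not None]
print(len(merged), len(linked))  # 3 2
```

Matching on both identifiers matters because CLASSCDE values repeat across schools; joining on CLASSCDE alone would pair children with classrooms in the wrong school.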

The exit interview data are organized into five datasets, each based on information collected from one of the exit interview sources (family, family service specialist, principal, teacher survey A, and teacher survey B). In order to merge exit data with the core child records, start by determining which exit interviews are desired. You can merge one or more exit interview datasets. To merge data in the ef, efs, and etb datasets, link records using the unique identifier NEWID. Exit interview information in the principals file (eqp) can be linked to the school datasets using the identifier RANDSCH, and the classroom exit information in the eta file can be linked to the classroom datasets using RANDSCH and CLASSCDE.
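
The linking rules for the exit interview files can be kept in a small lookup table so that each merge uses the correct target and identifiers. The Python sketch below simply restates the rules in this guide; the structure and function name are assumptions for illustration.

```python
# Target dataset and linking identifiers for each exit interview file,
# as described in this guide.
EXIT_LINKS = {
    "EF": {"target": "child/family", "keys": ["NEWID"]},
    "EFS": {"target": "child/family", "keys": ["NEWID"]},
    "ETB": {"target": "child/family", "keys": ["NEWID"]},
    "EQP": {"target": "school", "keys": ["RANDSCH"]},
    "ETA": {"target": "classroom", "keys": ["RANDSCH", "CLASSCDE"]},
}


def link_plan(file_prefix):
    """Describe how an exit interview file links to the core datasets."""
    info = EXIT_LINKS[file_prefix.upper()]
    return f"{file_prefix.upper()}: merge with {info['target']} data on {', '.join(info['keys'])}"


for name in ("ef", "eqp", "eta"):
    print(link_plan(name))
```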

Finally, community and program characteristics can be appended to the child record. Records in both datasets are linked directly to the school dataset using a site number (SITE NUMBER or SITENO). This merge should take place before linking school data with the child file.

IMPORTANT NOTE - Read This Before Linking Records Across Datasets

Users interested in creating a longitudinal dataset by linking files from more than one collection year will need to rename the desired variables. Various datasets included on the NTDS CD contain variables based on questions asked in more than one time period. In these cases, the variable name is identical from one data collection point (and corresponding dataset) to the next. To merge these files without overwriting values, it is necessary to assign new variable names so that variables that exist at each collection point can be distinguished from one another. For example, if the variable HOUSE (type of residence) is desired for more than one data collection point, HOUSE can be renamed to KN_HOUSE, FG_HOUSE, etc. to designate the collection point at which each was collected.
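
A minimal Python sketch of this renaming step follows. It prefixes every variable except NEWID and then combines two collection points into one record per child/family. The records are dictionaries and the HOUSE values are invented for illustration; the same logic can be expressed with rename statements in SAS or SPSS.

```python
def add_prefix(records, prefix, keep=("NEWID",)):
    """Prefix every variable name except those listed in `keep`."""
    return [
        {(k if k in keep else prefix + k): v for k, v in rec.items()}
        for rec in records
    ]


def merge_by_id(*waves):
    """Combine per-wave records into one wide record per NEWID."""
    wide = {}
    for wave in waves:
        for rec in wave:
            wide.setdefault(rec["NEWID"], {}).update(rec)
    return [wide[newid] for newid in sorted(wide)]


# Hypothetical values for HOUSE (type of residence).
kindergarten = [{"NEWID": 1, "HOUSE": 2}, {"NEWID": 2, "HOUSE": 1}]
first_grade = [{"NEWID": 1, "HOUSE": 3}]

wide = merge_by_id(add_prefix(kindergarten, "KN_"), add_prefix(first_grade, "FG_"))
print(wide)
```

Without the prefixes, the first grade value of HOUSE would overwrite the kindergarten value for child 1 during the merge.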

This step can be avoided for the child/family unit data, if the researcher uses the child/family "flattened file" dataset. This flat file dataset includes all of the data from child/family unit datasets for individual observation periods, that is, kindergarten (combined), first grade, second grade and third grade data. As noted above, this dataset is recommended for users interested in longitudinal analysis of child/family data or for researchers interested only in child/family data.

The flattened child/family dataset contains 10,829 child/family cases. Variable names begin with the prefixes KN_, FG_, SG_, and TG_, in order to distinguish variables pertaining to the kindergarten, first grade, second grade, and third grade observation periods. The full name of each variable consists of one of these prefixes followed by the variable name of the corresponding variable, as designated in the original dataset from which it was obtained. The only exception is that the variable name NEWID has not been changed by the addition of a prefix, because this variable uniquely identifies a child/family and therefore does not change values across datasets or data collection periods.

It is recommended that users seeking to merge longitudinal child/family data with other datasets start with the flattened child/family dataset. Prior to implementing these merges, it will be necessary to rename the variables (except for NEWID) in the other datasets with the prefixes KN_, FG_, SG_, and TG_, in order to conduct a successful merge and to distinguish variables pertaining to the kindergarten, first grade, second grade, and third grade observation periods.
