National Health and Nutrition Examination Survey (NHANES), 2003-2004 (ICPSR 25503)
Principal Investigator(s): United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics
The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys were conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999 the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements that were designed to meet current and emerging concerns. The surveys examine a nationally representative sample of approximately 5,000 persons each year. These persons are located in counties across the United States, 15 of which are visited each year. For NHANES 2003-2004, 12,761 persons were selected for the sample; of those, 10,122 (79.3 percent) were interviewed and 9,643 (75.6 percent) were examined in the mobile examination centers (MEC). Many of the NHANES 2003-2004 questions were also asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2002. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. As in past health examination surveys, data were collected on the prevalence of chronic conditions in the population. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey. Risk factors, those aspects of a person's lifestyle, constitution, heredity, or environment that may increase the chances of developing a certain disease or condition, were examined. Data on smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, weight, and dietary intake were collected. 
Information on certain aspects of reproductive health, such as use of oral contraceptives and breastfeeding practices, was also collected. The diseases, medical conditions, and health indicators that were studied include: anemia, cardiovascular disease, diabetes and lower extremity disease, environmental exposures, equilibrium, hearing loss, infectious diseases and immunization, kidney disease, mental health and cognitive functioning, nutrition, obesity, oral health, osteoporosis, physical fitness and physical functioning, reproductive history and sexual behavior, respiratory disease (asthma, chronic bronchitis, emphysema), sexually transmitted diseases, skin diseases, and vision. The sample for the survey was selected to represent the United States population of all ages. Special emphasis in the 2003-2004 NHANES was on adolescent health and the health of older Americans. To produce reliable statistics for these groups, adolescents aged 12-19 years and persons aged 60 years and older were over-sampled for the survey. African Americans and Mexican Americans were also over-sampled to enable accurate estimates for these groups. Several important areas in adolescent health, including nutrition and fitness and other aspects of growth and development, were addressed. Because the United States experienced dramatic growth in the number of older people during the twentieth century, the aging population has major implications for health care needs, public policy, and research priorities. NCHS is working with public health agencies to increase the knowledge of the health status of older Americans. NHANES has a primary role in this endeavor. In the examination, all participants visit the physician, who takes their pulse or blood pressure. Dietary interviews and body measurements are included for everyone. All but the very young have a blood sample taken and see the dentist. 
Depending upon the age of the participant, the rest of the examination includes tests and procedures to assess the various aspects of health listed above. Usually, the older the individual, the more extensive the examination. Some persons who are unable or unwilling to come to the examination center may be given a less extensive examination in their homes. Demographic data file variables are grouped into three broad categories: (1) Status Variables: these provide core information on the survey participant. Examples of the core variables include interview status, examination status, and sequence number. (Sequence number is a unique ID assigned to each sample person and is required to match the information on this demographic file to the rest of the NHANES 2003-2004 data.) (2) Recoded Demographic Variables: these variables include age (age in months for persons through age 19 years, 11 months; age in years for 1- to 84-year-olds, and a top-coded age group of 85 years of age and older), gender, a race/ethnicity variable, current or highest grade of education completed (less than high school, high school, and more than high school education), country of birth (United States, Mexico, or other foreign born), Poverty Income Ratio (PIR), income, and a pregnancy status variable (adjudicated from various pregnancy-related variables). Some of the groupings were made due to limited sample sizes for the two-year data set. (3) Interview and Examination Sample Weight Variables: sample weights are available for analyzing NHANES 2003-2004 data. For a complete listing of survey contents for all years of the NHANES see the document -- Survey Content -- NHANES 1999-2010.
These data are freely available.
This study is maintained and distributed by the National Archive of Computerized Data on Aging (NACDA), the aging program within ICPSR. NACDA is sponsored by the National Institute on Aging (NIA) at the National Institutes of Health (NIH).
WARNING: Because this study has many datasets, the download all files option has been suppressed, and you will need to download one dataset at a time.
WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. National Health and Nutrition Examination Survey (NHANES), 2003-2004. ICPSR25503-v6. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2012-02-22. http://doi.org/10.3886/ICPSR25503.v6
Persistent URL: http://doi.org/10.3886/ICPSR25503.v6
Scope of Study
Subject Terms: acculturation, aging, alcohol consumption, allergies, anxiety, cardiovascular disease, cognitive functioning, consumer behavior, demographic characteristics, depression (psychology), diabetes, diet, disease, drug use, emotional states, emotional support, ethnicity, eyesight, health behavior, health care, health insurance, health services utilization, health status, hearing (physiology), hospitalization, illness, immunization, income, malnutrition, medical evaluation, mental health, nutrition, occupations, physical fitness, populations, pregnancy, prescription drugs, reproductive history, respiratory diseases, risk factors, sexual behavior, sleep disorders, smoking, social indicators, social support, treatment, tuberculosis, vaccines
Geographic Coverage: United States
Date of Collection:
Unit of Observation: individual
Universe: The NHANES target population is the civilian, noninstitutionalized United States population.
Data Types: clinical data, survey data
Data Collection Notes:
NCHS provides continuous updates/new data notification, as well as other important information for the NHANES. It is recommended that users of these data sign up for the information through the NHANES Listserv. The "What's New" page on the NHANES Web site provides updates/new information which may not be included in the listserv emails. Further, not all documentation files are included with this ICPSR release and may be found at the NHANES 2003-2004 Web site.
In preparing the data files for this collection, the National Center for Health Statistics (NCHS) has removed direct identifiers and characteristics that might lead to identification of data subjects. As an additional precaution NCHS requires, under Section 308(d) of the Public Health Service Act (42 U.S.C. 242m), that data collected by NCHS not be used for any purpose other than statistical analysis and reporting. NCHS further requires that analysts not use the data to learn the identity of any persons or establishments and that the director of NCHS be notified if any identities are inadvertently discovered. ICPSR member institutions and other users ordering data from ICPSR are expected to adhere to these restrictions.
Many variables listed in the Demographic questionnaire sections of the Household Interview were omitted by NCHS from this data release because of concerns about participant confidentiality. NCHS did not include confidential and administrative data in this release, and some variables have been recoded or top-coded to protect the confidentiality of survey participants.
Many of the NHANES 2003-2004 questions were also asked in NHANES II, 1976-1980, Hispanic HANES 1982-1984, and NHANES III, 1988-1994. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups.
NHANES 2003-2004 survey design and demographic variables are found in Part 1 Demographics file in this release. All of the data files can be linked by using the common survey participant identification number (variable name: SEQN). Merging information from multiple NHANES 2003-2004 data files using SEQN ensures that the appropriate information for each survey participant is linked correctly. All data files should be sorted by SEQN.
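The linking step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not a procedure taken from NCHS or ICPSR. The helper name merge_on_seqn, the sample SEQN values, and the non-SEQN variable names (RIAGENDR, BMXWT) are assumptions made for the example; only SEQN is named in this description.

```python
def merge_on_seqn(demographics, component):
    """Left-join component records onto demographics records by SEQN.

    Both arguments are lists of dicts that each contain a "SEQN" key.
    Participants missing from the component file keep only their
    demographic fields, since files differ in their number of records.
    """
    by_seqn = {row["SEQN"]: row for row in component}
    merged = []
    # Sort by SEQN so the output follows the recommended ordering.
    for demo in sorted(demographics, key=lambda row: row["SEQN"]):
        combined = dict(demo)
        combined.update(by_seqn.get(demo["SEQN"], {}))
        merged.append(combined)
    return merged


# Illustrative records only (made-up values, assumed variable names).
demo = [
    {"SEQN": 21007, "RIAGENDR": 2},
    {"SEQN": 21005, "RIAGENDR": 1},
    {"SEQN": 21006, "RIAGENDR": 2},
]
exam = [
    {"SEQN": 21005, "BMXWT": 70.1},
    {"SEQN": 21007, "BMXWT": 55.4},
]
merged = merge_on_seqn(demo, exam)
```

A left join from the demographics file is used here so that every sampled person is retained even when a component file has no record for them; whether that is the right choice depends on the analysis.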
The NHANES 2003-2004 data files do not have the same number of records in each file. For example, there are different numbers of subjects in the Interview and Examination samples of the survey. Additionally, the number of records in each data file varies depending on gender and age profiles for the specific component(s).
The sample person demographic file is composed of a limited set of core variables that are required to analyze NHANES 2003-2004 data.
Per agreement with NCHS, ICPSR distributes the data file(s) and text of the technical documentation for this collection as prepared by NCHS.
IMPORTANT NOTE CONCERNING THE AVAILABILITY OF DATA FILES: Dataset 37 is so large that it exceeds the software limitations for the Stata data file format (file extension .dta) and SAS Cport Transport file format (file extension .stc). As such, these files are not available. Users requiring these files are encouraged to utilize the ASCII version of the data file, along with the appropriate setup file for the desired software package. However, the setup file may need to be edited so that only a subset of the data is accessed, in an effort to avoid exceeding the software limitations.
All data files have been merged with the demographics file with the exception of those parts that did not contain the linking variable SEQN, which are as follows: Questionnaire: Dietary Supplement Use - Supplement Information (Part 214), Questionnaire: Dietary Supplement Use - Ingredient Information (Part 215), Questionnaire: Dietary Supplement Use - Supplement Blend (Part 216), and Questionnaire: Drug Information (Part 237).
Within the Oral Health (Dentition) file (Part 33), NCHS coded multiple conditions per record for variables OHX02CSC through OHX15CSC and OHX18CSC through OHX31CSC, as well as OHX02SE, OHX03SE, OHX14SE, OHX15SE, OHX18SE, OHX19SE, OHX30SE, and OHX31SE. Users should consult the documentation and review the labeling of the variables in question for further information.
The user guides that are presently available consist of documentation from the NCHS. These user guides do not reflect the merging of each file with the demographics file, as this was done by ICPSR staff.
The old .tsv and .txt files have been taken down.
The most recent series of data collection waves for NHANES began in 1999. Every year, approximately 7,000 individuals of all ages are interviewed in their homes, and of these, approximately 5,000 complete the health examination component of the survey. A majority of the health examinations are conducted in mobile examination centers (MECs). The MECs provide an ideal setting for the collection of high quality data in a standardized environment. In addition to the MEC examinations, a small number of survey participants receive an abbreviated health examination in their homes if they are unable to come to the MEC. The NHANES target population is the civilian, noninstitutionalized United States population. NHANES 2003-2004 includes over-sampling of low-income persons, adolescents 12-19 years of age, persons 60 years of age and older, African Americans, and Mexican Americans. Initially, households are identified for inclusion in the NHANES sample and an advance letter is mailed to each address informing the occupant(s) that an NHANES interviewer will visit their home. The household interview component comprises Screener, Sample Person, and Family interviews, each of which has a separate questionnaire (please refer to the data file documentation). Trained household interviewers administer all of the questionnaires. In most cases, the interview setting is the survey participant's home. The interview data are recorded using the Blaise computer-assisted personal interview (CAPI) system. When the interviewer arrives at the home, he or she shows an official identification badge and briefly explains the purpose of the survey. If the occupant has not seen the advance letter, the interviewer provides a copy to read. The interviewer requests that the occupant answer a brief questionnaire to determine if any household occupants are eligible to participate in NHANES. If eligible individuals are identified, the interviewer proceeds with efforts to recruit these individuals. 
Initially, the interviewer explains the household questionnaires to all eligible participants 16 years of age and older, informs the potential respondents of their rights, and provides assurances about the confidentiality of the survey data (reiterating what is stated in the advance letter). A majority of the household interviews are conducted during the first contact. If this is inconvenient for the survey participant, an appointment is made to administer the household interview questionnaires later. Household interviews for survey participants under 16 years of age are conducted with a proxy (usually their parent or guardian). If there is no one living in the household who is over 16, participants under 16 years of age are permitted to self-report. Respondents are asked to sign an Interview Consent Form agreeing to participate in the household interview portion of the survey. For participants 16-17 years of age a parent or guardian consents and the child gives his/her assent. After the household interview is completed, the interviewer reviews a second informed consent brochure with the participant. This brochure contains detailed information about the NHANES health examination component. All interviewed persons are asked to complete the health examination component. Those who agree to participate are asked to sign additional consent forms for the health examination component. The interviewer telephones the NHANES field office from the participant's home to schedule an appointment for the examination. The interviewer informs the participants that they will receive remuneration as well as reimbursement for transportation and childcare expenses, if necessary.
There are different target population groups for the topics within and between NHANES questionnaire sections. For example, in the Nutrition and Diet Behavior section, questions pertaining to infant nutrition and breast-feeding were asked of proxy respondents for children 6 years of age and younger, alcohol consumption frequency questions were asked of persons 20 years of age and older, and senior meal program participation questions were asked of respondents 60 years of age and older. Data users should review the survey questionnaire codebooks thoroughly to determine the target populations for each NHANES questionnaire section and sub-section.
The NHANES Health Examination Component
When a participant arrives at the MEC, the MEC Coordinator greets the participant and verifies all pertinent identifier information. Each participant receives a disposable paper gown and a pair of slippers to wear during their examination. Persons six years of age and older are asked to provide a urine specimen. MEC staff direct participants to the rooms where the examination components are conducted. In addition to the MEC Coordinator, each MEC survey team consists of one physician, one dentist, two dietary interviewers, three certified medical technologists, five health technicians, one phlebotomist, two interviewers and one computer data manager. Upon completion of the examination, each examinee is remunerated. Some of the medical findings from the examination are given to the examinees before they leave the MEC. The other reportable survey findings are mailed to participants after the laboratory assays and special tests are completed.
Three MECs are equipped for use in NHANES. Each MEC consists of four large, inter-connected trailer units. An advance team sets up the MECs prior to the start of the survey examinations. Water, sewer, electrical, and communication lines are connected during set-up. The MEC equipment and data collection systems must be checked and calibrated prior to the start of survey data collection. The MECs are open a total of five days per week, and the nonoperational days change on a rotating basis so that appointments can be scheduled on any day of the week. Two examination sessions are conducted daily. For the convenience of the survey participants, appointments can be scheduled during morning, afternoon, or evening hours. The examinations require up to three hours to complete. At any given time during the survey, examinations are conducted at two survey locations simultaneously. Staff vacations are scheduled for periods of about one month at New Year's and about two weeks during the summer, leaving ten and one-half months to conduct examinations.
Participants who are 50 years and older or less than 1 year old and are unable or unwilling to travel to the MEC are offered a home examination administered by an examiner from the MEC.
Sample: The NHANES survey design is a stratified, multistage probability sample of the civilian noninstitutionalized United States population. The stages of sample selection are: (1) selection of Primary Sampling Units (PSUs) which are counties or small groups of contiguous counties, (2) segments within PSUs (a block or group of blocks containing a cluster of households), (3) households within segments, and (4) one or more participants within households. A total of 15 PSUs are visited during a 12-month time period. Details of the design and content of each survey are available at the NHANES Web site.
Sample weights are available for analyzing NHANES 2003-2004 data. Most data analyses require either the interviewed sample weight (variable name: WTINT2YR) or examined sample weight (variable name: WTMEC2YR). The two-year sample weights (WTINT2YR, WTMEC2YR) should be used for NHANES 2003-2004 analyses. Use of the correct sample weight for NHANES analyses is extremely important and depends on the variables being used. A good rule of thumb is to use "the least common denominator" approach. With this approach, the analyst checks the variables of interest. The variable that was collected on the smallest number of persons is the "least common denominator," and the sample weight that applies to that variable is the appropriate one to use for that particular analysis. Please refer to the NHANES 2003-2004 Analytic Guidelines provided with the data release files to determine the appropriate analytic methodology.
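The "least common denominator" rule of thumb can be sketched in code. This is a simplified illustration, not part of the NCHS guidelines: it assumes each analysis variable has been tagged by hand as coming from the interview or the examination, and it relies on the fact stated above that fewer persons were examined than interviewed. The function name choose_two_year_weight and the "interview"/"exam" tags are hypothetical, as are the example variable names.

```python
def choose_two_year_weight(variable_sources):
    """Pick WTINT2YR or WTMEC2YR for a set of analysis variables.

    variable_sources maps each variable name to "interview" or "exam"
    (tags assigned by the analyst; this tagging scheme is an assumption).
    The examined sample is smaller than the interviewed sample, so any
    examination variable makes the examined weight the right choice.
    """
    for name, source in variable_sources.items():
        if source not in ("interview", "exam"):
            raise ValueError(f"unknown source for {name}: {source!r}")
    if any(source == "exam" for source in variable_sources.values()):
        return "WTMEC2YR"
    return "WTINT2YR"


# Hypothetical variable names, used only to exercise the function.
interview_only = choose_two_year_weight({"smoking_status": "interview"})
mixed = choose_two_year_weight(
    {"smoking_status": "interview", "blood_pressure": "exam"}
)
```

The sketch covers only the two weights named in this description; the Analytic Guidelines remain the authority on which weight applies to a given variable.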
NCHS September 2006 Version--NHANES Analytic Guidelines
Beginning in 1999, the National Health and Nutrition Examination Survey (NHANES) became a continuous, annual survey rather than the periodic survey that it had been in the past. For a variety of reasons, including disclosure and reliability issues, the survey data are released on public use data files every two years. Thus, the data release cycle for the ongoing (and continuous) NHANES is described as NHANES 1999-2000, NHANES 2001-2002, NHANES 2003-2004, etc. In addition to the analysis of data from any two-year cycle, it is possible to combine two or more "cycles" (e.g., 2003-2004 and 2005-2006) to create NHANES 2003-2006, thus increasing sample size and analytic options. In order to produce estimates with greater statistical reliability, combining two or more two-year cycles of the continuous NHANES is encouraged and strongly recommended. When combining cycles of data, it is extremely important that the user (1) verify that data items collected in all combined years were comparable in wording and methods and (2) use a proper sampling weight. Beginning in 2003, the survey content for each two-year period is held as constant as possible to be consistent with the data release cycle. In the first four years of the continuous survey, this was not always the case, and some special data release and data access procedures had to be developed and used for selected survey content collected in "other than two-year" intervals (see the NHANES release policy). The decision on how many years of NHANES data are required for a particular analysis can be summarized by the concept of minimum sample size required. The minimum sample size is determined by the statistic to be estimated (e.g., mean, total, or proportion), the reliability criteria (e.g., 20 or 30 percent relative standard error), the design effect for the statistic (DEFF, defined as the variance inflation factor), and the degrees of freedom for the standard error estimate. 
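To make the role of these inputs concrete, the sketch below works through one case: a proportion with a target relative standard error (RSE). The formula is a common textbook approximation and is an assumption here, not a rule quoted from the NCHS guidelines. It takes the simple-random-sample variance of a proportion, p(1-p)/n, inflates it by DEFF, and solves RSE = sqrt(DEFF * (1-p) / (n * p)) for n. It ignores the degrees-of-freedom criterion, and the function name is hypothetical.

```python
def min_sample_size_for_proportion(p, max_rse, deff):
    """Approximate sample size needed to estimate a proportion p.

    p        expected proportion, strictly between 0 and 1
    max_rse  largest acceptable relative standard error (0.30 = 30 percent)
    deff     design effect (variance inflation factor), at least 1 in practice

    Solves max_rse**2 = deff * (1 - p) / (n * p) for n.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if max_rse <= 0 or deff <= 0:
        raise ValueError("max_rse and deff must be positive")
    return deff * (1 - p) / (p * max_rse ** 2)


# A 10 percent prevalence, a 30 percent RSE limit, and DEFF = 1.5
# call for roughly 150 sample persons under this approximation.
n_needed = min_sample_size_for_proportion(0.10, 0.30, 1.5)
```

Rare conditions and tighter RSE limits raise the required size quickly, which is one reason a sub-domain can fall short in a single two-year cycle.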
Earlier NHANES surveys were conducted for four or more years and, thus, have larger samples than a two-year cycle of the current continuous NHANES. However, in each of those surveys, many sub-domains did not meet minimum sample size requirements and in those cases the above concerns were (and still are) relevant. When combining two or more two-year cycles of the continuous NHANES, the user must calculate new combined sample weights, using the following procedure, before beginning any analysis of the data. NCHS will not be calculating and including all possible combinations of multiple two-year cycles of the continuous survey because it would be impractical to produce them and include them on all public release files. Because of a particular issue with Census population estimates, a set of four-year weights was created for the first four years of the continuous NHANES -- 1999-2002. The sample weights for NHANES 1999-2000 were based on population estimates developed by the Bureau of the Census before the Year 2000 Decennial Census counts became available. The two-year sample weights for NHANES 2001-2002 were based on population estimates that incorporate the year 2000 Census counts. The two population estimates were not strictly comparable. To facilitate analysis for these first four years of the continuous NHANES, appropriate four-year sample weights (comparable to Census 2000 counts) were calculated and added to the demographic data files for both 1999-2000 and 2001-2002. These sample weights have the same variable name in each file. For example, for the sample persons for whom there are MEC data items, the variable name for the four-year weight is WTMEC4YR. Thus, users of the earlier release of the NHANES 1999-2000 demographic file must use the updated demographic file to appropriately analyze the combined four-year data 1999-2002. 
Because NHANES 2003-2004 uses the same year 2000 Census counts as were used for NHANES 2001-2002, there is no need to create special four-year weights for 2001-2004. For a four-year estimate for 2001-2004, one can create a new variable for a four-year weight by assigning half of the two-year weight for 2001-2002 if the person was sampled in 2001-2002 or assigning half of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the two-year weights for 2003-2004 are comparable to the 2001-2002 weights (in terms of a population basis). For an estimate for the six years 1999-2004, a six-year weight variable can be created by assigning two-thirds of the four-year weight for 1999-2002 if the person was sampled in 1999-2002, or assigning one-third of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the 2003-2004 weights are also comparable (on a population basis) to the combined four-year weights specifically created for 1999-2002.
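The weight arithmetic just described can be sketched as follows for the examined (MEC) weights. The fractions (one-half, two-thirds, one-third) and the variable names WTMEC2YR and WTMEC4YR come from the text above; the function names, the "cycle" field used to record which release a person was sampled in, and the sample values are assumptions made for the example.

```python
def four_year_weight_2001_2004(record):
    """Four-year MEC weight for combined 2001-2004 data.

    Each person gets half of their two-year weight, whichever of the
    two cycles they were sampled in.
    """
    if record["cycle"] not in ("2001-2002", "2003-2004"):
        raise ValueError(f"cycle outside 2001-2004: {record['cycle']}")
    return record["WTMEC2YR"] / 2


def six_year_weight_1999_2004(record):
    """Six-year MEC weight for combined 1999-2004 data.

    Persons sampled in 1999-2002 get two-thirds of the four-year weight;
    persons sampled in 2003-2004 get one-third of the two-year weight.
    """
    if record["cycle"] in ("1999-2000", "2001-2002"):
        return record["WTMEC4YR"] * 2 / 3
    if record["cycle"] == "2003-2004":
        return record["WTMEC2YR"] / 3
    raise ValueError(f"cycle outside 1999-2004: {record['cycle']}")


# Made-up weights, chosen only to keep the arithmetic easy to follow.
early = {"cycle": "2001-2002", "WTMEC2YR": 24000.0, "WTMEC4YR": 30000.0}
late = {"cycle": "2003-2004", "WTMEC2YR": 30000.0}

w4_early = four_year_weight_2001_2004(early)
w6_early = six_year_weight_1999_2004(early)
w6_late = six_year_weight_1999_2004(late)
```

The same pattern would apply to the interview weights, assuming the corresponding interview weight variables are substituted for the MEC ones.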
This information summarizes the most recent analytic and reporting guidelines that should be used for most NHANES analyses and publications. It is important for users to understand the entire document and to become familiar with statistical issues in the analysis of complex survey data. These suggested guidelines provide a framework to users for producing estimates that conform to the analytic design of the survey. Because statistical methods for analyzing complex survey data are continually evolving, these recommendations may differ slightly from those used by analysts for previous NHANES surveys. It is important to remember that the statistical guidelines in this document are not absolute. When conducting analyses, the analyst needs to use his/her subject matter knowledge (including methodological issues), as well as information about the survey design. The more one deviates from the original analytic categories and original analytic objectives defined in the planning documents, the more important it is to evaluate the results carefully and to interpret the findings cautiously. Future versions of the NHANES Analytic and Reporting Guidelines will include additional topics, such as sample sizes and response rates for each NHANES survey, hypothesis testing, multivariate analysis, and a discussion of the concept of statistical versus practical significance. These are guidelines, not standards. Depending upon the subject matter and statistical efficiency, specific analyses may depart from these guidelines, but the burden of proof for statistical efficiency and for appropriate data interpretation is on the data user/analyst. Again, NHANES data files from the continuous survey are publicly released on a two-year basis (1999-2000, 2001-2002, 2003-2004, etc.) and as small, content-specific files. The data files and associated documentation, as well as these analytic guidelines, may be edited and/or updated to reflect new data release files. 
Users should periodically check the NHANES website to determine if any new or revised data files have been released and if these analytic guidelines have been updated.
Mode of Data Collection: audio computer-assisted self interview (ACASI), computer-assisted personal interview (CAPI), computer-assisted self interview (CASI), face-to-face interview, on-site questionnaire
Presence of Common Scales: DISC -- Predictive Scale
- Performed consistency checks.
- Created variable labels and/or value labels.
- Standardized missing values.
- Checked for undocumented or out-of-range codes.
Original ICPSR Release: 2010-04-07
- 2012-02-22 Data and documentation for the following parts have been updated: 22, 23, 121, 137, 138, 146, 147. Additional updates will occur in the future because NCHS frequently revises the NHANES data collection.
- 2011-07-12 Part 37 was added separately because of file size.
- 2011-06-22 The majority of the data files in this collection were merged with the demographics file.
- 2010-04-19 The codebook was revised for Part 37 to explain that .dta and .stc files will not be available due to the size of the data file and software limitations.