National Health and Nutrition Examination Survey (NHANES), 2007-2008 (ICPSR 25505)
Principal Investigator(s): United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999 the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements which were designed to meet current and emerging concerns. The surveys examine a nationally representative sample of approximately 5,000 persons each year. These persons are located in counties across the United States, 15 of which are visited each year. For NHANES 2007-2008, there were 12,946 persons selected for the sample, 10,149 of those were interviewed (78.4 percent) and 9,762 (75.4 percent) were examined in the mobile examination centers (MEC). Many of the NHANES 2007-2008 questions were also asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2006. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. As in past health examination surveys, data were collected on the prevalence of chronic conditions in the population. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey. Risk factors, those aspects of a person's lifestyle, constitution, heredity, or environment that may increase the chances of developing a certain disease or condition, were examined. Data on smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, weight, and dietary intake were collected. 
Information on certain aspects of reproductive health, such as use of oral contraceptives and breastfeeding practices, was also collected. The diseases, medical conditions, and health indicators that were studied include: anemia, cardiovascular disease, diabetes and lower extremity disease, environmental exposures, equilibrium, hearing loss, infectious diseases and immunization, kidney disease, mental health and cognitive functioning, nutrition, obesity, oral health, osteoporosis, physical fitness and physical functioning, reproductive history and sexual behavior, respiratory disease (asthma, chronic bronchitis, emphysema), sexually transmitted diseases, skin diseases, and vision. The sample for the survey was selected to represent the United States population of all ages. The NHANES target population is the civilian, noninstitutionalized United States population. Beginning in 2007, some changes were made to the domains being oversampled. The primary change is the oversampling of the entire Hispanic population instead of just the Mexican American (MA) population, which has been oversampled since 1988. Sufficient numbers of MAs were retained in the sample design so that trends in the health of MAs can continue to be monitored. Persons 60 years of age and older, Blacks, and low-income persons were also oversampled. In addition, for each of the race/ethnicity domains, the 12-15 and 16-19 year age domains were combined and the 40-59 year age minority domains were split into 10-year age domains of 40-49 and 50-59. This has led to an increase in the number of participants aged 40 and older and a decrease in 12- to 19-year-olds from previous cycles. The oversample of pregnant women and adolescents in the survey from 1999-2006 was discontinued to allow for the oversampling of the Hispanic population. NCHS is working with public health agencies to increase knowledge of the health status of older Americans. NHANES has a primary role in this endeavor. 
In the examination, all participants visit the physician who takes their pulse or blood pressure. Dietary interviews and body measurements are included for everyone. All but the very young have a blood sample taken and see the dentist. Depending upon the age of the participant, the rest of the examination includes tests and procedures to assess the various aspects of health listed above. Usually, the older the individual, the more extensive the examination. Demographic data file variables are grouped into three broad categories: (1) Status Variables: Provide core information on the survey participant. Examples of the core variables include interview status, examination status, and sequence number. (Sequence number [SEQN] is a unique ID number assigned to each sample person and is required to match the information on this demographic file to the rest of the NHANES 2007-2008 data.) (2) Recoded Demographic Variables: The variables include age (age in months for persons under age 80, age in years for 1 to 80-year-olds, and a top-coded age group of 80 years and older), gender, a race/ethnicity variable, current or highest grade of education completed (less than high school, high school, and more than high school education), country of birth (United States, Mexico, or other foreign born), ratio of family income to poverty threshold, income, and a pregnancy status variable (adjudicated from various pregnancy-related variables). Some of the groupings were made due to limited sample sizes for the two-year dataset. (3) Interview and Examination Sample Weight Variables: Sample weights are available for analyzing NHANES 2007-2008 data. Most data analyses require either the interviewed sample weight (variable name: WTINT2YR) or examined sample weight (variable name: WTMEC2YR). The two-year sample weights (WTINT2YR, WTMEC2YR) should be used for NHANES 2007-2008 analyses.
These data are freely available.
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. National Health and Nutrition Examination Survey (NHANES), 2007-2008. ICPSR25505-v3. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2012-02-22. http://doi.org/10.3886/ICPSR25505.v3
Persistent URL: http://doi.org/10.3886/ICPSR25505.v3
Scope of Study
Subject Terms: acculturation, aging, alcohol consumption, allergies, anxiety, cardiovascular disease, cognitive functioning, consumer behavior, demographic characteristics, depression, diabetes, diet, disease, drug use, emotional states, emotional support, ethnicity, health behavior, health care, health insurance, health services utilization, health status, hearing (physiology), hospitalization, immunization, income, malnutrition, medical conditions, medical evaluation, mental health, nutrition, occupations, physical fitness, populations, pregnancy, prescription drugs, reproductive history, respiratory diseases, risk factors, sexual behavior, sleep disorders, smoking, social indicators, social support, treatment, tuberculosis, vaccines, vision
Geographic Coverage: United States
Date of Collection:
Unit of Observation: individual
Universe: The NHANES target population is the civilian, noninstitutionalized United States population.
Data Types: clinical data, survey data
Data Collection Notes:
NCHS provides continuous updates/new data notification, as well as other important information for the NHANES. It is recommended that users of these data sign up for the information through the NHANES Listserv. The "What's New" page on the NHANES Web site provides updates/new information which may not be included in the listserv emails. Further, not all documentation files are included with this ICPSR release and may be found at the NHANES 2007-2008 Web site.
In preparing the data files for this collection, the National Center for Health Statistics (NCHS) has removed direct identifiers and characteristics that might lead to identification of data subjects. As an additional precaution NCHS requires, under Section 308(d) of the Public Health Service Act (42 U.S.C. 242m), that data collected by NCHS not be used for any purpose other than statistical analysis and reporting. NCHS further requires that analysts not use the data to learn the identity of any persons or establishments and that the director of NCHS be notified if any identities are inadvertently discovered. ICPSR member institutions and other users ordering data from ICPSR are expected to adhere to these restrictions.
Many variables that are listed in the Demographic questionnaire sections of the Household Interview were omitted (by NCHS) from this data release due to concerns about participant confidentiality. NCHS did not include confidential and administrative data in this release and further, some variables have been recoded or top-coded to protect the confidentiality of survey participants.
NHANES 2007-2008 survey design and demographic variables are found in Part 1 Demographics file in this release. All of the data files can be linked by using the common survey participant identification number (variable name: SEQN). Merging information from multiple NHANES 2007-2008 data files using SEQN ensures that the appropriate information for each survey participant is linked correctly. All data files should be sorted by SEQN.
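The SEQN-based linkage described above can be sketched as follows. This is a minimal illustration in plain Python, not ICPSR's or NCHS's procedure: the function name `merge_by_seqn`, the variable names `DEMO_VALUE` and `EXAM_VALUE`, and all record values are hypothetical. Only the linking variable SEQN comes from the documentation. Because the files do not all contain the same participants, the sketch keeps every demographic record and attaches component data only where a matching SEQN exists.

```python
def merge_by_seqn(demographics, component):
    """Left-join component records onto demographic records using SEQN.

    Every demographic record is kept; component variables are added only
    for participants who appear in the component file.
    """
    component_by_seqn = {row["SEQN"]: row for row in component}
    merged = []
    # Sort by SEQN so the output order is predictable.
    for demo_row in sorted(demographics, key=lambda r: r["SEQN"]):
        combined = dict(demo_row)
        combined.update(component_by_seqn.get(demo_row["SEQN"], {}))
        merged.append(combined)
    return merged


# Hypothetical toy records; real files carry many more variables.
demographics = [
    {"SEQN": 3, "DEMO_VALUE": "c"},
    {"SEQN": 1, "DEMO_VALUE": "a"},
    {"SEQN": 2, "DEMO_VALUE": "b"},
]
examination = [
    {"SEQN": 1, "EXAM_VALUE": 138.9},
    {"SEQN": 3, "EXAM_VALUE": 83.9},
]

merged = merge_by_seqn(demographics, examination)
for row in merged:
    print(row)
```

In this toy data the participant with SEQN 2 has no examination record, so the merged row for that person carries only the demographic variable.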
The NHANES 2007-2008 data files do not have the same number of records in each file. For example, there are different numbers of subjects in the Interview and Examination samples of the survey. Additionally, the number of records in each data file varies depending on gender and age profiles for the specific component(s).
The sample person demographic file is composed of a limited set of core variables that are required to analyze NHANES 2007-2008 data.
Per agreement with NCHS, ICPSR distributes the data file(s) and text of the technical documentation for this collection as prepared by NCHS.
Note on the 2007-2008 Data Documentation: Changes have been made to the format of the Data Documentation, Codebook, and Frequencies for 2007-2008. Revised DHHS and CDC regulations require that public-use documents that are published on the CDC Web site must be made available in a format that is compliant with Section 508 of the Rehabilitation Act. Due to the tabular format of codebook items, it is no longer practical to publish data documentation in Adobe Portable Document Format (PDF) in a manner that is compliant with Section 508. Starting with the 2007-2008 data release, data documentation will be published in HTML format. The document in its new format contains the same information as it has in previous years. Data documentation for previous years will also be retrofitted to the new HTML format. Some changes you may notice: The cover page with color graphics, introduced with the release of 2003-2004 data, has been eliminated. The Record Locator has been moved to the front of the document, following the title information. The section bookmarks included in earlier PDF documents have been eliminated. The new format has been optimized for printing. The document displays in a Web browser without pagination as a single, continuous page consistent with its HTML format. However, when viewed using the print preview function, headers, footers, and page numbers should be displayed, as these are configured by your local Web browser settings. Codebook items have been formatted to place one or two on a page, in order not to break items over page boundaries. In the case of a few items that contain more entries than will fit on a single page, column headings have been carried over to the continuation page. Actual behavior may vary according to the vendor, version, and settings of a Web browser. As these are beyond the control of the CDC, users may need to consult their local expert if further information or assistance is needed.
The user guides that are presently available consist of documentation from NCHS. These user guides do not reflect the merging of each file with the demographics file, as this was done by ICPSR staff.
The most recent series of data collection waves for NHANES began in 1999. Every year, approximately 7,000 individuals, of all ages, are interviewed in their homes and of these, approximately 5,000 complete the health examination component of the survey. The health examinations are conducted in mobile examination centers (MECs). The MECs provide an ideal setting for the collection of high quality data in a standardized environment. The NHANES target population is the civilian, noninstitutionalized United States population. In 2007-2008 a new sampling methodology was implemented. All Hispanics were oversampled, not just Mexican Americans. In addition, for each of the race/ethnicity domains, the 12-15 and 16-19 year age domains were combined and the 40-59 year age minority domains were split into 10-year age domains of 40-49 and 50-59. This has led to an increase in the number of participants aged 40 and older and a decrease in 12- to 19-year-olds from previous cycles. Lastly, pregnant women are no longer oversampled. Based on these changes some variables have been modified from previous release cycles. Initially, households are identified for inclusion in the NHANES sample and an advance letter is mailed to each address informing the occupant(s) that an NHANES interviewer will visit their home. The household interview component comprises Screener, Sample Person, and Family interviews, each of which has a separate questionnaire (please refer to the data file documentation). Trained household interviewers administer all of the questionnaires. In most cases, the interview setting is the survey participant's home. The interview data are recorded using the Blaise computer-assisted personal interview (CAPI) system. When the interviewer arrives at the home, he or she shows an official identification badge and briefly explains the purpose of the survey. If the occupant has not seen the advance letter, a copy is given to them to read. 
The interviewer requests that the occupant answer a brief questionnaire to determine if any household occupants are eligible to participate in NHANES. If eligible individuals are identified, the interviewer proceeds with efforts to recruit these individuals. Initially, the interviewer explains the household questionnaires to all eligible participants 16 years of age and older, informs the potential respondents of their rights, and provides assurances about the confidentiality of the survey data (reiterating what is stated in the advance letter). A majority of the household interviews are conducted during the first contact. If this is inconvenient for the survey participant, an appointment is made to administer the household interview questionnaires later. Household interviews for survey participants under 16 years of age are conducted with a proxy (usually their parent or guardian). If there is no one living in the household who is over 16, participants under 16 years of age are permitted to self-report. Respondents are asked to sign an Interview Consent Form agreeing to participate in the household interview portion of the survey. For participants 16-17 years of age a parent or guardian consents and the child gives his/her assent. After the household interview is completed, the interviewer reviews a second informed consent brochure with the participant. This brochure contains detailed information about the NHANES health examination component. All interviewed persons are asked to complete the health examination component. Those who agree to participate are asked to sign additional consent forms for the health examination component. The interviewer telephones the NHANES field office from the participant's home to schedule an appointment for the examination. The interviewer informs the participants that they will receive remuneration as well as reimbursement for transportation and childcare expenses, if necessary.
There are different target population groups for the topics within and between NHANES questionnaire sections. For example, in the Nutrition and Diet Behavior section, questions pertaining to infant nutrition and breast-feeding were asked of proxy respondents for children 6 years of age and younger, alcohol consumption frequency questions were asked of persons 20 years of age and older, and senior meal program participation questions were asked of respondents 60 years of age and older. Data users should review the survey questionnaire codebooks thoroughly to determine the target populations for each NHANES questionnaire section and subsection.
The NHANES Health Examination Component
When a participant arrives at the MEC, s/he is greeted by the MEC Coordinator, who is responsible for seeing to it that the sample person (SP) receives all the appropriate exams for his/her gender and age. The SP changes from street clothes into a paper gown, trousers, and slippers provided by the MEC. S/he is then given an ID bracelet with an identification number and escorted from the reception area to each of the exam locations within the MEC. Persons six years of age and older are asked to provide a urine specimen. MEC staff direct participants to the rooms where the examination components are conducted. In addition to the MEC Coordinator, each MEC survey team consists of one physician, one dentist, two dietary interviewers, three certified medical technologists, five health technicians, one phlebotomist, two interviewers, and one computer data manager. Upon completion of the examination, each examinee is remunerated. Some of the medical findings from the examination are given to the examinees before they leave the MEC. The other reportable survey findings are mailed to participants after the laboratory assays and special tests are completed.
Three MECs are equipped for use in NHANES. Each MEC consists of four large, interconnected trailer units. An advance team sets up the MECs prior to the start of the survey examinations. Water, sewer, electrical, and communication lines are connected during set-up. The MEC equipment and data collection systems must be checked and calibrated prior to the start of survey data collection. The MECs are open a total of five days per week, and the nonoperational days change on a rotating basis so that appointments can be scheduled on any day of the week. Two examination sessions are conducted daily. For the convenience of the survey participants, appointments can be scheduled during morning, afternoon, or evening hours. The examinations require up to three hours to complete. At any given time during the survey, examinations are conducted at two survey locations simultaneously. Staff vacations are scheduled for periods of about one month at New Year's and about two weeks during the summer, leaving ten and one-half months to conduct examinations.
Sample: The NHANES survey design is a stratified, multistage probability sample of the civilian noninstitutionalized United States population. The stages of sample selection are: (1) selection of Primary Sampling Units (PSUs) which are counties or small groups of contiguous counties, (2) segments within PSUs (a block or group of blocks containing a cluster of households), (3) households within segments, and (4) one or more participants within households. A total of 15 PSUs are visited during a 12-month time period. Details of the design and content of each survey are available at the NHANES Web site.
Sample weights are available for analyzing NHANES 2007-2008 data. Most data analyses require either the interviewed sample weight (variable name: WTINT2YR) or examined sample weight (variable name: WTMEC2YR). The two-year sample weights (WTINT2YR, WTMEC2YR) should be used for NHANES 2007-2008 analyses. Please refer to the NHANES Analytic Guidelines provided with the data release files to determine the appropriate methodology for analyses of combined years of data.
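As an illustration of how the examined sample weight enters an estimate, the following minimal sketch computes a weighted mean in plain Python. Only the weight variable name WTMEC2YR comes from the documentation. The function `weighted_mean`, the variable `MEASURE`, and all values are hypothetical. The sketch produces a point estimate only; standard errors for a complex sample design call for design-based variance estimation, which is not shown here.

```python
def weighted_mean(records, value_key, weight_key="WTMEC2YR"):
    """Return the sample-weighted mean of value_key.

    Records missing either the value or the weight are skipped.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for row in records:
        value = row.get(value_key)
        weight = row.get(weight_key)
        if value is None or weight is None:
            continue
        weighted_sum += weight * value
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no records with both a value and a weight")
    return weighted_sum / total_weight


# Hypothetical examined participants; MEASURE stands in for any exam variable.
examined = [
    {"SEQN": 1, "WTMEC2YR": 10000.0, "MEASURE": 120.0},
    {"SEQN": 2, "WTMEC2YR": 30000.0, "MEASURE": 140.0},
    {"SEQN": 3, "WTMEC2YR": 20000.0, "MEASURE": None},  # missing value, skipped
]

estimate = weighted_mean(examined, "MEASURE")
print(estimate)  # 135.0
```

For variables collected in the household interview rather than in the MEC, the same function could be called with `weight_key="WTINT2YR"`.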
NCHS September 2006 Version--NHANES Analytic Guidelines
Beginning in 1999, the National Health and Nutrition Examination Survey (NHANES) became a continuous, annual survey rather than the periodic survey that it had been in the past. For a variety of reasons, including disclosure and reliability issues, the survey data are released on public use data files every two years. Thus, the data release cycle for the ongoing (and continuous) NHANES is described as NHANES 1999-2000, NHANES 2001-2002, NHANES 2003-2004, etc. In addition to the analysis of data from any two-year cycle, it is possible to combine two or more "cycles" (e.g., 2003-2004 and 2005-2006) to create NHANES 2003-2006, thus increasing sample size and analytic options. In order to produce estimates with greater statistical reliability, combining two or more two-year cycles of the continuous NHANES is strongly recommended. When combining cycles of data, it is extremely important that the user (1) verify that data items collected in all combined years were comparable in wording and methods and (2) use a proper sampling weight. Beginning in 2003, the survey content for each two-year period is held as constant as possible to be consistent with the data release cycle. In the first four years of the continuous survey, this was not always the case, and some special data release and data access procedures had to be developed and used for selected survey content collected in "other than two-year" intervals (see the NHANES release policy). The decision on how many years of NHANES data are required for a particular analysis can be summarized by the concept of minimum sample size required. The minimum sample size is determined by the statistic to be estimated (e.g., mean, total, proportion...), the reliability criteria (e.g., 20 or 30 percent relative standard error), the design effect for the statistic (DEFF, defined as the variance inflation factor), and the degrees of freedom for the standard error estimate. 
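To make the minimum sample size idea concrete, here is a rough sketch for a proportion. It is a textbook approximation, not the NCHS procedure: it assumes the variance of an estimated proportion p is DEFF * p(1 - p) / n, so that requiring the relative standard error to stay at or below a target r gives n >= DEFF * (1 - p) / (p * r^2). The function names and example inputs are hypothetical, and the degrees-of-freedom criterion mentioned in the guidelines is not addressed.

```python
import math


def required_n(p, max_rse, deff=1.0):
    """Unrounded sample size at which the relative standard error of an
    estimated proportion p equals max_rse, assuming
    variance = deff * p * (1 - p) / n (an assumed approximation)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if max_rse <= 0 or deff <= 0:
        raise ValueError("max_rse and deff must be positive")
    return deff * (1 - p) / (p * max_rse ** 2)


def min_sample_size(p, max_rse, deff=1.0):
    """Smallest whole number of sample persons meeting the RSE target."""
    return math.ceil(required_n(p, max_rse, deff))


# Hypothetical inputs: 10 percent prevalence, 30 percent RSE, DEFF of 1.5.
raw = required_n(0.10, 0.30, deff=1.5)
needed = min_sample_size(0.10, 0.30, deff=1.5)
print(round(raw, 2), needed)
```

Under these assumptions, rarer conditions and larger design effects both raise the required sample size, which is one reason a subdomain may need more than one two-year cycle.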
Earlier NHANES surveys were conducted for four or more years and, thus, have larger samples than a two-year cycle of the current continuous NHANES. However, in each of those surveys, many subdomains did not meet minimum sample size requirements and in those cases the above concerns were (and still are) relevant. When combining two or more two-year cycles of the continuous NHANES, the user must calculate new sample weights, using the following procedure, before beginning any analysis of the data. NCHS will not be calculating and including all possible combinations of multiple two-year cycles of the continuous survey because it would be impractical to produce them and include them on all public release files. Because of a particular issue with Census population estimates, a set of four-year weights was created for the first four years of the continuous NHANES -- 1999-2002. The sample weights for NHANES 1999-2000 were based on population estimates developed by the Bureau of the Census before the Year 2000 Decennial Census counts became available. The two-year sample weights for NHANES 2001-2002 were based on population estimates that incorporate the year 2000 Census counts. The two population estimates were not strictly comparable. To facilitate analysis for these first four years of the continuous NHANES, appropriate four-year sample weights (comparable to Census 2000 counts) were calculated and added to the demographic data files for both 1999-2000 and 2001-2002. These sample weights have the same variable name in each file. For example, for the sample persons for whom there are MEC data items, the variable name for the four-year weight is WTMEC4YR. Thus, users of the earlier release of the NHANES 1999-2000 demographic file must use the updated demographic file to appropriately analyze the combined four-year data 1999-2002. 
Because NHANES 2003-2004 uses the same year 2000 Census counts as were used for NHANES 2001-2002, there is no need to create special four-year weights for 2001-2004. For a four-year estimate for 2001-2004, one can create a new variable for a four-year weight by assigning half of the two-year weight for 2001-2002 if the person was sampled in 2001-2002 or assigning half of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the two-year weights for 2003-2004 are comparable to the 2001-2002 weights (in terms of a population basis). For an estimate for the six years 1999-2004, a six-year weight variable can be created by assigning two-thirds of the four-year weight for 1999-2002 if the person was sampled in 1999-2002, or assigning one-third of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the 2003-2004 weights are also comparable (on a population basis) to the combined four-year weights specifically created for 1999-2002.
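The weight arithmetic just described can be sketched as two small functions. The fractions (one-half, two-thirds, one-third) come from the guidelines above. The function names and the cycle labels used as arguments are hypothetical, and the example weight values are made up. The inputs would be the MEC or interview two-year weights and, for 1999-2002, the four-year weight (for example WTMEC4YR).

```python
def four_year_weight_2001_2004(cycle, two_year_weight):
    """Four-year weight for 2001-2004: half of the person's two-year weight."""
    if cycle not in ("2001-2002", "2003-2004"):
        raise ValueError("cycle must be 2001-2002 or 2003-2004")
    return two_year_weight / 2


def six_year_weight_1999_2004(cycle, four_year_weight=None, two_year_weight=None):
    """Six-year weight for 1999-2004.

    Persons sampled in 1999-2002 get two-thirds of their four-year weight;
    persons sampled in 2003-2004 get one-third of their two-year weight.
    """
    if cycle in ("1999-2000", "2001-2002"):
        if four_year_weight is None:
            raise ValueError("four_year_weight is required for 1999-2002")
        return four_year_weight * 2 / 3
    if cycle == "2003-2004":
        if two_year_weight is None:
            raise ValueError("two_year_weight is required for 2003-2004")
        return two_year_weight / 3
    raise ValueError("cycle must fall within 1999-2004")


# Made-up weights for illustration only.
print(four_year_weight_2001_2004("2003-2004", 30000.0))                  # 15000.0
print(six_year_weight_1999_2004("1999-2000", four_year_weight=30000.0))  # 20000.0
print(six_year_weight_1999_2004("2003-2004", two_year_weight=30000.0))   # 10000.0
```

The fractions reflect each period's share of the combined span: a two-year cycle is half of four years and one-third of six, while the four-year block 1999-2002 is two-thirds of six.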
This information summarizes the most recent analytic and reporting guidelines that should be used for most NHANES analyses and publications. It is important for users to understand the entire document and to become familiar with statistical issues in the analysis of complex survey data. These suggested guidelines provide a framework to users for producing estimates that conform to the analytic design of the survey. Because statistical methods for analyzing complex survey data are continually evolving, these recommendations may differ slightly from those used by analysts for previous NHANES surveys. It is important to remember that the statistical guidelines in this document are not absolute. When conducting analyses, the analyst needs to use his/her subject matter knowledge (including methodological issues), as well as information about the survey design. The more one deviates from the original analytic categories and original analytic objectives defined in the planning documents, the more important it is to evaluate the results carefully and to interpret the findings cautiously. Future versions of the NHANES Analytic and Reporting Guidelines will include additional topics, such as sample sizes and response rates for each NHANES survey, hypothesis testing, multivariate analysis, and a discussion of the concept of statistical versus practical significance. These are guidelines, not standards. Depending upon the subject matter and statistical efficiency, specific analyses may depart from these guidelines; but the burden of proof for statistical efficiency and for appropriate data interpretation is on the data user/analyst. Again, NHANES data files from the continuous survey are publicly released on a two-year basis (1999-2000, 2001-2002, 2003-2004, etc.) and as small, content-specific files. The data files and associated documentation, as well as these analytic guidelines, may be edited and/or updated to reflect new data release files. 
Users should periodically check the NHANES Web site to determine if any new or revised data files have been released and if these analytic guidelines have been updated.
Mode of Data Collection: audio computer-assisted self interview (ACASI), computer-assisted personal interview (CAPI), computer-assisted self interview (CASI), face-to-face interview, on-site questionnaire
Presence of Common Scales: DISC -- Predictive Scale
- Performed consistency checks.
- Created variable labels and/or value labels.
- Standardized missing values.
- Checked for undocumented or out-of-range codes.
Original ICPSR Release: 2010-05-04
- 2012-02-22 Data and documentation for the following parts have been updated: 121, 123. Additional updates will follow, as NCHS frequently revises the NHANES data collection.
- 2011-09-07 All data files in this collection were merged with the demographics file.