
Analyze & Subset--Study No. 25503

Title: National Health and Nutrition Examination Survey (NHANES), 2003-2004

Online Analysis Using SDA

The online analysis system allows you to run both simple and complex analyses, recode and compute new variables, and subset variables or cases for downloading. The software powering the system, named Survey Documentation and Analysis (SDA), was developed by the Computer-assisted Survey Methods Program (CSM) at the University of California, Berkeley.

Click on the link(s) below to begin using SDA.


Please note that weights may affect analysis results.

Sample weights are available for analyzing NHANES 2003-2004 data. Most data analyses require either the interviewed sample weight (variable name: WTINT2YR) or examined sample weight (variable name: WTMEC2YR). The two-year sample weights (WTINT2YR, WTMEC2YR) should be used for NHANES 2003-2004 analyses. Use of the correct sample weight for NHANES analyses is extremely important and depends on the variables being used. A good rule of thumb is to use "the least common denominator" approach. With this approach, the analyst checks the variables of interest. The variable that was collected on the smallest number of persons is the "least common denominator," and the sample weight that applies to that variable is the appropriate one to use for that particular analysis. Please refer to the NHANES 2003-2004 Analytic Guidelines provided with the data release files to determine the appropriate analytic methodology.
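
The "least common denominator" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable labels and person counts are hypothetical placeholders, not values from the NHANES files, and the function name `choose_weight` is made up here. Only the weight names WTINT2YR and WTMEC2YR come from the text above.

```python
# Hypothetical catalogue: variable label -> (persons with data, applicable weight).
# Labels and counts are invented for illustration; only the weight names
# WTINT2YR and WTMEC2YR come from the NHANES 2003-2004 documentation.
VARIABLES = {
    "interview_item": (10000, "WTINT2YR"),
    "exam_item": (9500, "WTMEC2YR"),
}


def choose_weight(variables_of_interest, catalogue=VARIABLES):
    """Return the weight tied to the variable collected on the fewest persons."""
    if not variables_of_interest:
        raise ValueError("at least one variable is required")
    # The "least common denominator" is the variable with the smallest count.
    smallest = min(variables_of_interest, key=lambda name: catalogue[name][0])
    return catalogue[smallest][1]


# An analysis that mixes an interview item with an exam item
# falls back to the weight of the less widely collected item.
print(choose_weight(["interview_item", "exam_item"]))
```

In practice the counts and the weight that applies to each variable have to be taken from the documentation of the individual data files.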

NCHS September 2006 Version--NHANES Analytic Guidelines

Beginning in 1999, the National Health and Nutrition Examination Survey (NHANES) became a continuous, annual survey rather than the periodic survey that it had been in the past. For a variety of reasons, including disclosure and reliability issues, the survey data are released on public use data files every two years. Thus, the data release cycle for the ongoing (and continuous) NHANES is described as NHANES 1999-2000, NHANES 2001-2002, NHANES 2003-2004, etc. In addition to the analysis of data from any two-year cycle, it is possible to combine two or more "cycles" (e.g., 2003-2004 and 2005-2006) to create NHANES 2003-2006, thus increasing sample size and analytic options. In order to produce estimates with greater statistical reliability, combining two or more two-year cycles of the continuous NHANES is strongly recommended. When combining cycles of data, it is extremely important that the user (1) verify that data items collected in all combined years were comparable in wording and methods and (2) use a proper sampling weight.

Beginning in 2003, the survey content for each two-year period is held as constant as possible to be consistent with the data release cycle. In the first four years of the continuous survey, this was not always the case, and some special data release and data access procedures had to be developed and used for selected survey content collected in "other than two-year" intervals (see the NHANES release policy).

The decision on how many years of NHANES data are required for a particular analysis can be summarized by the concept of minimum sample size required. The minimum sample size is determined by the statistic to be estimated (e.g., mean, total, or proportion), the reliability criteria (e.g., 20 or 30 percent relative standard error), the design effect for the statistic (DEFF, defined as the variance inflation factor), and the degrees of freedom for the standard error estimate.
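
To illustrate how these ingredients interact, the sketch below works through the case of an estimated proportion. It is not the NCHS procedure. It assumes the textbook approximation SE(p) = sqrt(DEFF * p * (1 - p) / n), so that a target relative standard error (RSE = SE / p) implies n = DEFF * (1 - p) / (p * RSE^2). The function names and the input values are hypothetical, and the degrees-of-freedom criterion is not modeled.

```python
def relative_standard_error(estimate, standard_error):
    """RSE is the standard error expressed as a fraction of the estimate."""
    if estimate == 0:
        raise ValueError("estimate must be non-zero")
    return standard_error / estimate


def min_sample_size_for_proportion(p, target_rse, deff):
    """Approximate n needed for a proportion p to reach target_rse.

    Assumes SE(p) = sqrt(deff * p * (1 - p) / n); solving RSE = SE / p
    for n gives n = deff * (1 - p) / (p * target_rse ** 2).
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if target_rse <= 0 or deff <= 0:
        raise ValueError("target_rse and deff must be positive")
    return deff * (1 - p) / (p * target_rse ** 2)


# Hypothetical inputs: a 10 percent prevalence, a 30 percent RSE criterion,
# and a design effect of 1.5.
n_needed = min_sample_size_for_proportion(p=0.10, target_rse=0.30, deff=1.5)
print(f"approximate minimum sample size: {n_needed:.0f}")
```

Under this approximation a rarer outcome, a stricter RSE criterion, or a larger design effect each raises the required sample size, which is one reason a sub-domain can fall short within a single two-year cycle.
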
Earlier NHANES surveys were conducted for four or more years and thus have larger samples than a two-year cycle of the current continuous NHANES. However, in each of those surveys, many sub-domains did not meet minimum sample size requirements, and in those cases the above concerns were (and still are) relevant.

When combining two or more two-year cycles of the continuous NHANES, the user must calculate new sample weights, following the procedure below, before beginning any analysis of the data. NCHS will not calculate and include weights for all possible combinations of multiple two-year cycles of the continuous survey because it would be impractical to produce them and include them on all public release files.

Because of a particular issue with Census population estimates, a set of four-year weights was created for the first four years of the continuous NHANES, 1999-2002. The sample weights for NHANES 1999-2000 were based on population estimates developed by the Bureau of the Census before the year 2000 Decennial Census counts became available. The two-year sample weights for NHANES 2001-2002 were based on population estimates that incorporate the year 2000 Census counts. The two population estimates were not strictly comparable. To facilitate analysis for these first four years of the continuous NHANES, appropriate four-year sample weights (comparable to Census 2000 counts) were calculated and added to the demographic data files for both 1999-2000 and 2001-2002. These sample weights have the same variable name in each file. For example, for the sample persons for whom there are MEC data items, the variable name for the four-year weight is WTMEC4YR. Thus, users of the earlier release of the NHANES 1999-2000 demographic file must use the updated demographic file to appropriately analyze the combined four-year data for 1999-2002.
Because NHANES 2003-2004 uses the same year 2000 Census counts as were used for NHANES 2001-2002, there is no need to create special four-year weights for 2001-2004. For a four-year estimate for 2001-2004, one can create a new variable for a four-year weight by assigning half of the two-year weight for 2001-2002 if the person was sampled in 2001-2002 or assigning half of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the two-year weights for 2003-2004 are comparable to the 2001-2002 weights (in terms of a population basis). For an estimate for the six years 1999-2004, a six-year weight variable can be created by assigning two-thirds of the four-year weight for 1999-2002 if the person was sampled in 1999-2002, or assigning one-third of the two-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the 2003-2004 weights are also comparable (on a population basis) to the combined four-year weights specifically created for 1999-2002.
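
The weight arithmetic described above can be sketched as follows. This is a minimal illustration, not NCHS code: the function names are hypothetical, the cycle labels are plain strings chosen for this example, and the weight values in the usage lines are made up. Pass in whichever two-year or four-year weight (for example WTMEC2YR or WTMEC4YR) is appropriate for the analysis.

```python
def four_year_weight_2001_2004(cycle, wt2yr):
    """Four-year weight for 2001-2004: half of the person's two-year weight."""
    if cycle not in ("2001-2002", "2003-2004"):
        raise ValueError(f"cycle {cycle!r} is outside 2001-2004")
    return wt2yr / 2


def six_year_weight_1999_2004(cycle, wt2yr=None, wt4yr=None):
    """Six-year weight for 1999-2004.

    Persons sampled in 1999-2002 get two-thirds of their four-year weight;
    persons sampled in 2003-2004 get one-third of their two-year weight.
    """
    if cycle in ("1999-2000", "2001-2002"):
        if wt4yr is None:
            raise ValueError("the four-year weight is required for 1999-2002")
        return 2 * wt4yr / 3
    if cycle == "2003-2004":
        if wt2yr is None:
            raise ValueError("the two-year weight is required for 2003-2004")
        return wt2yr / 3
    raise ValueError(f"cycle {cycle!r} is outside 1999-2004")


# Hypothetical weight values, for illustration only.
print(four_year_weight_2001_2004("2003-2004", wt2yr=30000.0))
print(six_year_weight_1999_2004("1999-2000", wt4yr=30000.0))
print(six_year_weight_1999_2004("2003-2004", wt2yr=30000.0))
```

The fractions reflect each source weight's share of the combined period: a two-year cycle is half of four years and one-third of six, while the 1999-2002 four-year weight covers two-thirds of six years.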

This information summarizes the most recent analytic and reporting guidelines that should be used for most NHANES analyses and publications. It is important for users to understand the entire document and to become familiar with statistical issues in the analysis of complex survey data. These suggested guidelines provide users with a framework for producing estimates that conform to the analytic design of the survey. Because statistical methods for analyzing complex survey data are continually evolving, these recommendations may differ slightly from those used by analysts for previous NHANES surveys.

It is important to remember that the statistical guidelines in this document are not absolute. When conducting analyses, the analyst needs to use subject matter knowledge (including knowledge of methodological issues), as well as information about the survey design. The more one deviates from the original analytic categories and original analytic objectives defined in the planning documents, the more important it is to evaluate the results carefully and to interpret the findings cautiously. Future versions of the NHANES Analytic and Reporting Guidelines will include additional topics, such as sample sizes and response rates for each NHANES survey, hypothesis testing, multivariate analysis, and a discussion of the concept of statistical versus practical significance.

These are guidelines, not standards. Depending upon the subject matter and statistical efficiency, specific analyses may depart from these guidelines, but the burden of proof for statistical efficiency and for appropriate data interpretation is on the data user/analyst.

Again, NHANES data files from the continuous survey are publicly released on a two-year basis (1999-2000, 2001-2002, 2003-2004, etc.) and as small, content-specific files. The data files and associated documentation, as well as these analytic guidelines, may be edited and/or updated to reflect new data release files.
Users should periodically check the NHANES website to determine if any new or revised data files have been released and if these analytic guidelines have been updated.

If you're having trouble with SDA utilities, you may wish to consult the online help files for SDA users provided by the Computer-assisted Survey Methods Program (CSM) at the University of California, Berkeley.