Accuracy of NCVS Estimates

The accuracy of an estimate is a measure of its total error, that is, the sum of all the errors affecting the estimate: sampling error as well as nonsampling error.

Sampling Error

The sample used for the NCVS is one of a large number of possible samples of equal size that could have been obtained by using the same sample design and selection procedures. Estimates derived from different samples would differ from one another due to sampling variability, or sampling error.

Nonsampling Error

Estimates may also be subject to nonsampling error. Substantial care is taken in the NCVS to reduce the sources of nonsampling error throughout all survey operations by means of a quality assurance program, quality controls, operational controls, and error-correcting procedures, but an unquantified amount of nonsampling error may still remain.

Major sources of nonsampling error are related to the inability of respondents to recall in detail the crimes that occurred during the 6 months prior to the interview. Research based on interviews of victims obtained from police files indicates that assault is recalled with the least accuracy of any crime measured by the NCVS. This may be related to the tendency of victims not to report crimes committed by offenders who are not strangers, especially if they are relatives. In addition, among certain groups, crimes that contain elements of assault could be a part of everyday life and are therefore forgotten or not considered important enough to mention to a survey interviewer. These recall problems may result in an understatement of the actual rate of assault.

Another source of nonsampling error is the inability of some respondents to recall the exact month a crime occurred, even though it was placed in the correct reference period. This error source is partially offset by interviewing monthly and using the estimation procedure described earlier.

Telescoping is another problem, in which incidents that occurred before the reference period are placed within the period. The effect of telescoping is minimized by using the bounding procedure previously described. The interviewer is provided with a summary of the incidents reported in the preceding interview and, if a similar incident is reported, can determine whether or not it is a new one by discussing it with the victim. Events that occurred after the reference period are set aside for inclusion with the data from the following interview.

Other sources of nonsampling error can result from other types of response mistakes, including errors in reporting incidents as crimes, misclassification of crimes, systematic data errors introduced by the interviewer, and errors made in coding and processing the data. Quality control and editing procedures were used to minimize the number of errors made by the respondents and the interviewers.

Since field representatives conducting the interviews usually reside in the area in which they interview, the race and ethnicity of the field representatives generally match those of the local population. Special efforts are made to further match field representatives and the people they interview in areas where English is not commonly spoken. About 90% of all NCVS field representatives are female.

Standard Error Computation

The standard error of a survey estimate is a measure of the variation among the estimates that would be obtained from all possible samples. It is therefore a measure of the precision (reliability) with which an estimate from a particular sample approximates the average result of all possible samples. The estimate and its associated standard error may be used to construct a confidence interval.
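As a minimal sketch of how an estimate and its standard error combine into a confidence interval, the interval can be formed as the estimate plus or minus a multiple of the standard error. The function name, the example figures, and the normal-approximation multiplier (1.96 for roughly 95% confidence) are illustrative assumptions, not values taken from the NCVS.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds of a normal-approximation interval.

    z=1.96 gives roughly 95% confidence; this multiplier is an assumption
    of the sketch, not something prescribed by the survey documentation.
    """
    if standard_error < 0:
        raise ValueError("standard_error must be non-negative")
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical figures: a rate of 20.0 per 1,000 with a standard error of 1.5.
lower, upper = confidence_interval(20.0, 1.5)
print(f"Approximate 95% interval: {lower:.2f} to {upper:.2f}")
```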

The National Crime Victimization Survey is commonly used to compute estimates of victimization events in the United States population. For example, comparing changes in annual robbery victimization rates would entail computing victimization rates for the years in question, calculating the differences, and then testing these differences to determine whether they were statistically significant. Computing standard errors or variances for these differences is a central task in determining whether these differences may be treated as reliable.
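The comparison described above might be sketched as follows. The sketch assumes the two annual rates and their standard errors are already in hand, and it treats the two estimates as independent, which ignores any correlation between years that the survey design may introduce; it is therefore a first approximation only. All names and figures are hypothetical.

```python
import math


def difference_z_test(rate_1, se_1, rate_2, se_2, critical_z=1.96):
    """Compare two rates, treating the two estimates as independent.

    Returns the difference, its standard error, the z statistic, and
    whether |z| exceeds the critical value (1.96 is roughly the 5% level).
    """
    diff = rate_2 - rate_1
    # Under independence, the variance of a difference is the sum of variances.
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    if se_diff == 0:
        raise ValueError("standard errors must not both be zero")
    z = diff / se_diff
    return diff, se_diff, z, abs(z) > critical_z


# Hypothetical robbery rates per 1,000 persons in two years.
diff, se_diff, z, significant = difference_z_test(5.0, 0.3, 4.0, 0.4)
print(f"difference={diff:.2f}, se={se_diff:.2f}, z={z:.2f}, significant={significant}")
```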

Standard error computation for these rates is problematic on several counts. First, estimates are generally computed not on raw data but on weighted data, which are sums of the weights of the cases selected. To compute the rates for the robbery analysis described above, for example, the sums of the person weights for all respondents experiencing a robbery in each year would be divided by the sums of person weights for all respondents in the same year. Computing variances on these sums as if they were independent observations would produce artifactually small variances and thus would compromise the statistical tests involved.
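The weighted rate in the robbery example can be sketched as a ratio of two sums of person weights. The record layout (keys "year", "person_weight", "robbed"), the per-1,000 scaling, and the figures are assumptions made for illustration; the weights are also used unadjusted here, which a real analysis would need to revisit.

```python
def weighted_rate_per_1000(records, year):
    """Weighted victimization rate per 1,000 persons for one year.

    records: iterable of dicts with hypothetical keys
    'year', 'person_weight', and 'robbed' (bool).
    """
    in_year = [r for r in records if r["year"] == year]
    total_weight = sum(r["person_weight"] for r in in_year)
    victim_weight = sum(r["person_weight"] for r in in_year if r["robbed"])
    if total_weight == 0:
        raise ValueError(f"no weighted respondents for year {year}")
    return 1000.0 * victim_weight / total_weight


# Hypothetical sample of four respondents in one year.
sample = [
    {"year": 1995, "person_weight": 1500.0, "robbed": False},
    {"year": 1995, "person_weight": 2000.0, "robbed": True},
    {"year": 1995, "person_weight": 2500.0, "robbed": False},
    {"year": 1995, "person_weight": 4000.0, "robbed": False},
]
rate = weighted_rate_per_1000(sample, 1995)
print(rate)
```

The numerator and denominator are sums over a handful of sample cases, not over thousands of independent observations, which is why a variance formula that treats the weighted totals as raw counts understates the uncertainty.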

Second, the NCVS relies on a stratified, multi-stage cluster sample design. Clustering of households in the sample generally results in variances that are larger than would be achieved with a simple random sample of the same size. Failure to take account of this "design effect" when using standard statistical tests will make accurate testing difficult, because standard tests assume independence among observations and a simple random sample design.
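The design effect is conventionally defined as the ratio of the variance under the actual sample design to the variance that a simple random sample of the same size would give. The sketch below shows how a known design effect would rescale a standard error computed under simple-random-sample assumptions. The function names and the value 4.0 are hypothetical; as noted in this section, the NCVS design effect is not constant across estimates.

```python
import math


def design_effect(variance_under_design, variance_under_srs):
    """Ratio of the design-based variance to the simple-random-sample variance."""
    if variance_under_srs <= 0:
        raise ValueError("variance_under_srs must be positive")
    return variance_under_design / variance_under_srs


def adjust_srs_standard_error(se_srs, deff):
    """Scale a simple-random-sample standard error by the square root of deff."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return se_srs * math.sqrt(deff)


# Hypothetical values: the design-based variance is four times the SRS variance.
deff = design_effect(0.04, 0.01)
adjusted_se = adjust_srs_standard_error(0.5, deff)
print(deff, adjusted_se)
```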

Before the 1992 NCVS redesign implementation, a set of standard errors was estimated using the Taylor Expansion Method and was generalized to be applicable to all survey estimates. Generalized curves were produced from subsets of computed standard errors, and formulae were developed for these curves to allow analysts to calculate the standard error of any survey estimate. These formulae and the constants applicable to each year and general domain of crime have been provided for each year of published NCS data.

Beginning with the 1992 NCVS data, a Jackknife Repeated Replication procedure was used to estimate standard errors, and a new set of formulae and constants was provided to calculate standard errors. Study of NCVS data had determined that the design effect was not constant across all survey estimates and that the new procedures provided a more accurate method for computing standard errors.
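To give a feel for replication-based variance estimation in general, the sketch below implements a simple delete-one-group jackknife for a weighted proportion. It is not the procedure used for the NCVS: the grouping of records, the estimator, and the data are all hypothetical, and the actual replicate structure of the survey is not described in this section.

```python
def weighted_proportion(groups):
    """Weighted share of flagged records across all groups.

    groups: list of groups, each a list of (weight, flag) pairs.
    """
    numerator = sum(w for g in groups for w, flag in g if flag)
    denominator = sum(w for g in groups for w, _ in g)
    return numerator / denominator


def jackknife_standard_error(groups, estimator):
    """Delete-one-group jackknife standard error of estimator(groups)."""
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two groups")
    # Recompute the estimate with each group left out in turn.
    replicates = [estimator(groups[:i] + groups[i + 1:]) for i in range(k)]
    mean_replicate = sum(replicates) / k
    variance = (k - 1) / k * sum((r - mean_replicate) ** 2 for r in replicates)
    return variance ** 0.5


# Hypothetical data: three groups of (weight, victimized) pairs.
groups = [
    [(1.0, True), (1.0, False)],
    [(1.0, False), (1.0, False)],
    [(1.0, True), (1.0, True)],
]
estimate = weighted_proportion(groups)
se = jackknife_standard_error(groups, weighted_proportion)
print(estimate, se)
```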

Standard error can be calculated in four different ways, depending on the type of estimate involved. Each of these ways requires a different formula and a set of constants that varies between year and domain of crime. These four types of crime estimates are as follows:

The Bureau of Justice Statistics is currently developing a guide to allow a wider audience to work with the NCVS data set. This guide will include a comprehensive description of how to calculate victimization estimates from counts, significance testing programs, generalized variance function (GVF) parameters, rho values, and other valuable NCVS analysis information. In addition, the codebook that is downloadable with the data contains considerable detail about the NCVS data set.

Weighting Information

Because the data collected by the National Crime Victimization Survey represent the total U.S. population 12 years and older, each record can be weighted to produce population estimates from the sample cases. Weights are carried in survey records and are numbers which one adds or accumulates to obtain universe estimates of particular events. The final weight is a multiplier that indicates how many times a particular sample record is to be counted. The NCVS provides three classes of weights -- household, personal, and incident -- and the appropriate weight to select for an analysis depends on the type of crime, victim, or unit of analysis involved. These weights are defined as:
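The role of the weight as a multiplier can be illustrated with a small sketch that contrasts a raw count of sample cases with the weighted (universe) estimate. The record layout and figures are hypothetical, and the weights are used without the adjustments that a real analysis requires.

```python
def weighted_count(records, weight_key, condition):
    """Sum the weights of the records that satisfy condition(record)."""
    return sum(r[weight_key] for r in records if condition(r))


# Hypothetical person records; each weight says how many people the case represents.
sample = [
    {"person_weight": 1800.0, "victim": True},
    {"person_weight": 2200.0, "victim": False},
    {"person_weight": 2000.0, "victim": True},
]

unweighted_victims = sum(1 for r in sample if r["victim"])
estimated_victims = weighted_count(sample, "person_weight", lambda r: r["victim"])
estimated_population = weighted_count(sample, "person_weight", lambda r: True)
print(unweighted_victims, estimated_victims, estimated_population)
```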

PERSON WEIGHT: Attached to the person record, this variable provides an estimate of the population represented by each person in the sample. Person weights are most frequently used to compute estimates of crime victimizations of persons in the total population. Person weights are used to form the denominator in a calculation of crime rates.

HOUSEHOLD WEIGHT: This weight is attached to the household record and is the weight of the "Principal Person" in the household. In husband-wife households, this is the weight for the wife, excluding the within-household non-interview adjustment (see below). For other households, the household weight is that of the individual identified as owning, buying, or renting the dwelling (the "Reference Person"), excluding the within-household non-interview adjustment. This weight is most commonly used to calculate estimates of property crimes, such as motor vehicle theft or burglary, that are identified with the household.

INCIDENT WEIGHT: This weight is attached to the incident record. For personal crimes, it is derived by dividing the person weight of a victim by the total number of persons victimized during an incident as reported by the respondent. For property crimes, the incident weight and the household weight are the same, because the victim of a property crime is considered to be the household as a whole. The incident weight is most frequently used to compute estimates of the number of crimes committed against a particular class of victim.
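The derivation of the incident weight described above can be sketched as a small function. The parameter names and example values are hypothetical and do not correspond to NCVS variable names.

```python
def incident_weight(is_personal_crime, person_weight=None,
                    household_weight=None, persons_victimized=1):
    """Derive an incident weight following the definitions in the text.

    Personal crimes: the victim's person weight divided by the number of
    persons victimized in the incident. Property crimes: the household weight.
    """
    if is_personal_crime:
        if person_weight is None:
            raise ValueError("person_weight is required for personal crimes")
        if persons_victimized < 1:
            raise ValueError("persons_victimized must be at least 1")
        return person_weight / persons_victimized
    if household_weight is None:
        raise ValueError("household_weight is required for property crimes")
    return household_weight


# Hypothetical personal crime with three victims, and a hypothetical burglary.
personal = incident_weight(True, person_weight=2400.0, persons_victimized=3)
burglary = incident_weight(False, household_weight=1900.0)
print(personal, burglary)
```

Dividing by the number of victims keeps an incident with several victims from being counted once per victim when incidents, not victimizations, are being estimated.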

Users of NCVS data need to be aware that even though the NCVS data contain the three weights described above, these weights should not simply be used as is. They need to be adjusted for the effect of a household's time in sample. Examples of adjusting weights are available in the codebook, in this resource guide, and from BJS.

Use of Weighted v. Unweighted Data

BJS analyses of NCVS data typically utilize weighted data. However, analysts have used both weighted and unweighted data to examine victimization-related research questions. Generally, using weighted data increases variance, but decreases sampling bias and bias that is introduced through unequal probabilities of selection and observation. Weighted data are more typically used to compute representative population estimates of victimization dynamics and to compare victimization patterns over time.
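One common rule of thumb for gauging how much unequal weights inflate variance is Kish's approximation, n times the sum of squared weights divided by the square of the summed weights. It is offered here only as a rough illustration: it is not drawn from NCVS documentation, it assumes the weights are unrelated to the variable being estimated, and the weights shown are hypothetical.

```python
def kish_weighting_effect(weights):
    """Kish's approximate variance inflation factor due to unequal weights.

    Equals 1.0 when all weights are equal and grows as they become more variable.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("weights must not be empty")
    total = sum(weights)
    total_of_squares = sum(w * w for w in weights)
    return n * total_of_squares / (total * total)


equal_effect = kish_weighting_effect([2000.0, 2000.0, 2000.0, 2000.0])
unequal_effect = kish_weighting_effect([1000.0, 3000.0])
print(equal_effect, unequal_effect)
```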

Unweighted data have been used to develop and explore predictive models of victimization. However, the analyst must assume that the model is insensitive to small temporal variations in the composition of the NCVS sample and reporting rates and that variables in the model are unrelated to time, if multiple years of data are utilized.

Selecting the correct weight is critical. Weights cannot simply be used as is; they must be adjusted. This adjustment depends on which data format is used (collection year or data year). See the NCVS codebook for a discussion of weighting.