National Crime Victimization Survey Resource Guide

About NCVS

The National Crime Victimization Survey (NCVS) series, previously called the National Crime Survey (NCS), has been collecting data on personal and household victimization since 1973. An ongoing survey of a nationally representative sample of residential addresses, the NCVS is the primary source of information on the characteristics of criminal victimization and on the number and types of crimes not reported to law enforcement authorities. It provides the largest national forum for victims to describe the impact of crime and characteristics of violent offenders. Twice each year, data are obtained from a nationally representative sample of roughly 49,000 households comprising about 100,000 persons on the frequency, characteristics, and consequences of criminal victimization in the United States. The survey is administered by the U.S. Census Bureau (under the U.S. Department of Commerce) on behalf of the Bureau of Justice Statistics (under the U.S. Department of Justice).

The NCVS was designed with four primary objectives: (1) to develop detailed information about the victims and consequences of crime, (2) to estimate the number and types of crimes not reported to the police, (3) to provide uniform measures of selected types of crimes, and (4) to permit comparisons over time and types of areas. The survey categorizes crimes as "personal" or "property." Personal crimes cover rape and sexual attack, robbery, aggravated and simple assault, and purse-snatching/pocket-picking, while property crimes cover burglary, theft, motor vehicle theft, and vandalism. The NCVS data are particularly useful for calculating crime rates, both aggregated and disaggregated, and for determining changes in crime rates from year to year.

Using the Resource Guide

NACJD, a part of the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan, designed this Resource Guide for World Wide Web users to learn about the NCVS dataset and to connect to other NCVS information sources.

With this guide, first-time users or experienced analysts can:

NCVS Data

The data include type of crime, month, time, and location of the crime, relationship between victim and offender, characteristics of the offender, self-protective actions taken by the victim during the incident and results of those actions, consequences of the victimization, type of property lost, whether the crime was reported to the police and reasons for reporting or not reporting, and offender use of weapons, drugs, and alcohol. Basic demographic information, such as age, race, gender, and income, is also collected to enable analysis of crime by various subpopulations.

Each year of the NCVS comprises a hierarchical, or "full," file and various incident-level files. The hierarchical file contains all of the incidents reported as occurring in a given year. Due to the timing of sample interviews, it also includes incidents occurring in the year prior to and the year subsequent to the given year. Incident-level files are prepared from the hierarchical file and are easier to work with. Generally, three incident-level files are available: a single-year incident-level file, a concatenated multi-year incident-level file (1992-present), and a concatenated multi-year rape subset file (1992-present). Unlike the hierarchical file, the single-year incident-level files are "bounded" to include only incidents occurring in the given year. The online version of the NCVS includes the concatenated multi-year incident-level file. Users can easily select a particular year and crime type of interest.

File Structure

Data from the National Crime Victimization Surveys (NCVS) are available from the ICPSR in a variety of forms designed to facilitate their accessibility and analytical use. These include the full data collection, which is stored as a hierarchically structured data file, and a variety of smaller extract files. The organization and use of these files is described below.

Unlike most data files, which are stored in a rectangular format (also known as "flat" format) with fixed record lengths, the structured file is organized into a hierarchical format which corresponds to variations in household composition and in the occurrence of incidents of victimization. Hierarchical datasets have varying record lengths, and each record is stored sequentially in the data file. Hierarchical storage greatly reduces the size and space needed to store and process the data. For the NCVS-92-present hierarchical data files, there are four types of records: 1) address ID record, 2) household record, 3) person record, and 4) incident record. The address ID and household records contain information about the household as reported by the respondent and characteristics of the surrounding area as computed by the Bureau of the Census. The person record contains information about each household member age 12 years and older as reported by that person or proxy, with one record for each qualifying individual. Finally, the incident record contains information drawn from the incident report completed for each household incident or person incident mentioned during the interview.

To illustrate, a typical cluster of housing units might contain four households, each with varying numbers of inhabitants some of whom report victimizations. The records for such a cluster might look as follows:

HOUSEHOLD          RECORD  RECORD TYPE OR CONTENTS
First Household    1       Address ID #1
                   2       Household #1
                   3       Person #1 of Household #1
                   4       Incident #1 for Person #1 of Household #1
Second Household   5       Address ID #2
                   6       Household #2
                   7       Person #1 of Household #2
                   8       Person #2 of Household #2
                   9       Incident #1 for Person #2 of Household #2
                   10      Incident #2 for Person #2 of Household #2
Third Household    11      Address ID #3
                   12      Household #3
                   13      Person #1 of Household #3
                   14      Person #2 of Household #3
                   15      Incident #1 for Person #2 of Household #3
                   16      Incident #2 for Person #2 of Household #3
                   17      Incident #3 for Person #2 of Household #3
                   18      Person #3 of Household #3
Fourth Household   19      Address ID #4
                   20      Household #4
                   21      Person #1 of Household #4
                   22      Incident #1 for Person #1 of Household #4
                   23      Incident #2 for Person #1 of Household #4
                   24      Person #2 of Household #4

With four different types of records, twenty-four in all, this may seem like a very complex structure. However, storing the data in this way allows for much greater technical efficiency.
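The sequential layout can be made concrete with a short sketch. The Python below is illustrative only: it assumes each record has already been read into a (record type, contents) pair, and the function name `build_hierarchy` and the dictionary layout are hypothetical, not part of the NCVS files or the ICPSR setup programs. It nests records 11 through 18 from the illustration (the third household) under their parents.

```python
# Record type codes follow the four record types named in the text.
ADDRESS, HOUSEHOLD, PERSON, INCIDENT = 1, 2, 3, 4


def build_hierarchy(records):
    """Nest a sequential list of (record_type, contents) pairs.

    Each address ID record starts a new unit; the household, person, and
    incident records that follow attach to the most recent parent.
    """
    units = []
    for rec_type, contents in records:
        if rec_type == ADDRESS:
            units.append({"address": contents, "household": None, "persons": []})
        elif rec_type == HOUSEHOLD:
            units[-1]["household"] = contents
        elif rec_type == PERSON:
            units[-1]["persons"].append({"person": contents, "incidents": []})
        elif rec_type == INCIDENT:
            units[-1]["persons"][-1]["incidents"].append(contents)
        else:
            raise ValueError(f"unknown record type: {rec_type}")
    return units


# Records 11-18 from the illustration (the third household).
third_household = [
    (ADDRESS, "Address ID #3"),
    (HOUSEHOLD, "Household #3"),
    (PERSON, "Person #1"),
    (PERSON, "Person #2"),
    (INCIDENT, "Incident #1"),
    (INCIDENT, "Incident #2"),
    (INCIDENT, "Incident #3"),
    (PERSON, "Person #3"),
]

units = build_hierarchy(third_household)
incident_counts = [len(p["incidents"]) for p in units[0]["persons"]]
print(incident_counts)  # [0, 3, 0]
```

Because a record's parent is simply the most recent record of the next-higher type, no pointer or padding is needed in the file itself; the order of the records carries the structure.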

If these data were stored in a rectangular, fixed-format ("flat") data file, a number of problems would have to be dealt with. For example, a household-level dataset may be constructed where the maximum number of persons is two and the maximum number of incidents reported is two per person. Four hypothetical household records from these data, each representing one household, consist of the following: one record with a single person and no incidents reported; a second record with two persons and no incidents reported; a third record with two persons where person one reports one incident; and a fourth record with two persons where person two reports two incidents. A rectangular data structure would require that the fixed record length be large enough to contain all combinations of persons and incidents. In this example, each record would require one place for the address ID-level data, one place for the household-level data, two places for the person-level data (one place for each person), and two places per person for the incident-level data. Much of this space would consist of missing data padding demanded by the fixed record length.

Assuming that each record type is approximately equal in size, approximately 50% of this small and uncomplicated file would consist of missing data padding. A hierarchical file avoids this wasted space by recording the data in fixed record lengths but only for existing data recorded from the interviews, thus utilizing space more efficiently.
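The arithmetic behind this estimate can be checked directly. The sketch below assumes, as the text does, that every record type occupies one equally sized "place"; the variable names are illustrative. Each flat record needs 8 places, so the four households occupy 32 places, of which 18 hold data. The padding share comes to roughly 44%, in line with the approximate figure quoted above.

```python
# Flat layout from the example: 1 address + 1 household + 2 person places
# + 2 incident places for each of the 2 persons.
MAX_PERSONS = 2
MAX_INCIDENTS_PER_PERSON = 2
slots_per_record = 1 + 1 + MAX_PERSONS + MAX_PERSONS * MAX_INCIDENTS_PER_PERSON

# (persons, incidents) for the four hypothetical households in the text.
households = [(1, 0), (2, 0), (2, 1), (2, 2)]

# Places that actually hold data: address + household + persons + incidents.
used = sum(1 + 1 + persons + incidents for persons, incidents in households)
total = slots_per_record * len(households)
padding_share = (total - used) / total

print(slots_per_record, used, total, padding_share)  # 8 18 32 0.4375
```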

SAS and SPSS program files provide the information needed to describe the composition of the varying record lengths and create datasets or subsets suitable for analysis. To access the appropriate information from a sequenced file, the underlying logic of the data structure must be understood. Using SAS and SPSS software, the four record types found in the NCVS-92-present full hierarchical file are linked together in a hierarchical pattern, where the address ID record forms the basic building block, or "root," of the file. The household, person, and incident records are then linked in a subordinate way to this root.

As a link is followed down from the root, the relationship is always one to many. In the NCVS-92-present data, each occurrence of a household will have associated with it any number of persons, and each occurrence of a person will have any number of incidents associated with it. In no case will a person-level record occur without a household or with additional households superordinate to it. The same is true for incidents vis-a-vis persons.
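The parent-child rule described above lends itself to a simple consistency check. The following is a minimal sketch, not a procedure taken from the NCVS documentation: it assumes the one-digit record type codes (1 through 4) have been read in file order, and the function name `find_orphans` is hypothetical. A record counts as an orphan when no record of the next-higher type is open above it.

```python
def find_orphans(record_types):
    """Return positions of records that have no valid parent above them.

    record_types: one-digit type codes (1=address ID, 2=household,
    3=person, 4=incident) in the order they appear in the file.
    """
    orphans = []
    open_level = 0  # deepest record type currently available as a parent
    for position, rec_type in enumerate(record_types):
        if rec_type == 1 or open_level >= rec_type - 1:
            open_level = rec_type
        else:
            orphans.append(position)
    return orphans


# Fourth household from the illustration: every record has a parent.
print(find_orphans([1, 2, 3, 4, 4, 3]))  # []

# A person record directly under an address ID, with no household record.
print(find_orphans([1, 3, 4]))  # [1, 2]
```

In the second call the incident at position 2 is also flagged, because the person it would belong to was itself rejected.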

Because of the hierarchical nature of the NCVS-92-present data files, identification or "linkage" variables have been added to each record type. These identification variables can be combined to create a link from the root to the lowest level in the hierarchy, which in the case of these hierarchical files is the incident record. The creation of links between the various record types is crucial when generating various file structures, ensuring accurate retrieval, and permitting file updating. The following is a list of the identification variables found in the NCVS-92-present full hierarchical data files:

LINKAGE VARIABLES FOR RECORD TYPE 1

Identification Variable Description
V1002 ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V1003 YEAR AND QUARTER IDENTIFICATION
V1004 SAMPLE NUMBER
V1005 SCRAMBLED CONTROL NUMBER
V1006 HOUSEHOLD NUMBER
V1007 LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V1008 PANEL AND ROTATION GROUP

LINKAGE VARIABLES FOR RECORD TYPE 2

Identification Variable Description
V2002 ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V2003 YEAR AND QUARTER IDENTIFICATION
V2004 SAMPLE NUMBER
V2005 SCRAMBLED CONTROL NUMBER
V2006 HOUSEHOLD NUMBER
V2007 LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V2008 PANEL AND ROTATION GROUP

LINKAGE VARIABLES FOR RECORD TYPE 3

Identification Variable Description
V3002 ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V3003 YEAR AND QUARTER IDENTIFICATION
V3004 SAMPLE NUMBER
V3005 SCRAMBLED CONTROL NUMBER
V3006 HOUSEHOLD NUMBER
V3007 LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V3008 PANEL AND ROTATION GROUP
V3009 PERSON SEQUENCE NUMBER
V3010 PERSON LINE NUMBER

LINKAGE VARIABLES FOR RECORD TYPE 4

Identification Variable Description
V4002 ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V4003 YEAR AND QUARTER IDENTIFICATION
V4004 SAMPLE NUMBER
V4005 SCRAMBLED CONTROL NUMBER
V4006 HOUSEHOLD NUMBER
V4007 LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V4008 PANEL AND ROTATION GROUP
V4009 PERSON SEQUENCE NUMBER
V4010 PERSON LINE NUMBER

For example, to ensure that a given person record is linked to its corresponding incident record, the identification variables V3002 through V3010 on the person record could be combined and used to link with the combination of variables V4002 through V4010 found on the incident record. Variables V1001, V2001, V3001, and V4001 are one-digit record type identifiers which indicate group levels for a particular record.
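As a rough illustration of this linkage, the sketch below combines V3002 through V3010 into a single key and matches it against the same combination of V4002 through V4010. It assumes the records are held as plain Python dictionaries keyed by variable name; the helper `make_record`, the `label` field, and every ID value shown are invented for the example and do not come from the NCVS files.

```python
# Linkage variable names as listed above for record types 3 and 4.
PERSON_KEYS = [f"V{n}" for n in range(3002, 3011)]    # V3002 ... V3010
INCIDENT_KEYS = [f"V{n}" for n in range(4002, 4011)]  # V4002 ... V4010


def link_key(record, key_vars):
    """Combine the identification variables into one hashable key."""
    return tuple(record[name] for name in key_vars)


def attach_incidents(persons, incidents):
    """Group incident records under the person record with the same key."""
    linked = {link_key(p, PERSON_KEYS): {"person": p, "incidents": []}
              for p in persons}
    for incident in incidents:
        linked[link_key(incident, INCIDENT_KEYS)]["incidents"].append(incident)
    return list(linked.values())


def make_record(record_type, ids, **extra):
    """Hypothetical helper: build a record dict from nine made-up ID values."""
    record = {f"V{record_type}{i:03d}": value
              for i, value in zip(range(2, 11), ids)}
    record.update(extra)
    return record


# All ID values below are invented for illustration.
household_ids = ("H000123", 20011, 17, "AB12CD", 1, 34, 21)
person_1 = make_record(3, household_ids + (1, 1), label="Person #1")
person_2 = make_record(3, household_ids + (2, 2), label="Person #2")
incident = make_record(4, household_ids + (2, 2), label="Incident #1")

linked = attach_incidents([person_1, person_2], [incident])
for entry in linked:
    print(entry["person"]["label"], len(entry["incidents"]))
```

Using the full combination of variables as the key, rather than the person sequence or line number alone, keeps persons in different households from being confused with one another.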

Methodological Issues in the Measurement of Crime

  • Recall, Reference Periods, and Bounding

    The NCVS uses a six-month reference period. Respondents are asked to report crime experiences occurring in the last six months. Generally, respondents are able to recall more accurately an event which occurred within three months of the interview rather than one which occurred within six months; they can recall events over a six-month period more accurately than over a 12-month period. However, a shorter reference period would require more field interviews per year, increasing the data collection costs significantly. These increased costs would have to be balanced by cost reductions elsewhere (sample size is often considered). Reducing sample size, however, reduces the precision of estimates of relatively rare crimes. In light of these trade-offs of cost and precision, a reference period of six months is used for the NCVS.

    A common concern of researchers employing reference periods in retrospective surveys is telescoping. Telescoping refers to a respondent's misspecification of when an incident occurred in relation to the reference period. For example, telescoping occurs if a respondent is asked about victimizations within the last six months and erroneously includes a victimization that occurred eight months ago. Telescoped events which actually occurred prior to the reference period can be minimized at the time of the first interview by a technique known as bounding. Bounding is achieved by comparing incidents reported in an interview with incidents reported in a previous interview and deleting duplicate incidents that were reported in the current reference period. In the NCS and NCVS designs, each visit to a household is used to bound the next one by comparing reports in the current interview with those given six months prior. When a report appears to be a duplicate, the respondent is reminded of the earlier report and asked if the new report represents the incident previously mentioned or if it is different. The first interview at a household entering the sample is unbounded, and data collected at these interviews are not included in NCS and NCVS estimates. However, if a household in the sample moves and another moves into that address, the first interview with the replacement household is unbounded but is included in NCS and NCVS estimates.

  • Proxy Interviews

    Proxy interviews are currently conducted for: (1) household members aged 12 or 13 if a knowledgeable household member insists they not be interviewed directly by the interviewer, (2) persons incapable of responding due to physical or mental incapacity, and (3) those persons who are away from the household during the entire interview period.
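The comparison step of bounding, described under Recall, Reference Periods, and Bounding, can be sketched in a few lines. This is a simplified illustration only: in the survey itself the respondent is reminded of the earlier report and asked whether the new report is the same incident, so the code merely flags candidates for that follow-up. The field names (`crime_type`, `location`, `month`), the matching rule, and the sample reports are all assumptions made for the example.

```python
def flag_possible_duplicates(current_reports, previous_reports,
                             match_fields=("crime_type", "location")):
    """Split current reports into possible duplicates and new reports.

    A report is flagged when it matches a report from the previous
    interview on every field in match_fields (an assumed matching rule).
    """
    previous_keys = {tuple(r[f] for f in match_fields) for r in previous_reports}
    flagged, unflagged = [], []
    for report in current_reports:
        key = tuple(report[f] for f in match_fields)
        (flagged if key in previous_keys else unflagged).append(report)
    return flagged, unflagged


# Invented reports from two consecutive interviews at one household.
previous = [{"crime_type": "burglary", "location": "home", "month": "March"}]
current = [
    # Same event, possibly telescoped into the current reference period.
    {"crime_type": "burglary", "location": "home", "month": "July"},
    {"crime_type": "theft", "location": "workplace", "month": "August"},
]

flagged, unflagged = flag_possible_duplicates(current, previous)
print([r["crime_type"] for r in flagged])    # ['burglary']
print([r["crime_type"] for r in unflagged])  # ['theft']
```

The reported month is deliberately left out of the matching rule, since a telescoped incident is by definition one whose date the respondent has misplaced.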

Using the Online Survey Documentation and Analysis

The Online Survey Documentation and Analysis is a set of programs for the documentation and web-based analysis of survey data and procedures for creating customized subsets of datasets. Online Data Analysis is available for NCVS only with the concatenated incident-level file, not the full file. The unit of analysis for that file is the crime incident.

Online data analysis is recommended for users who would like to search for variables of interest in a dataset, review frequencies or summary statistics of key variables to determine what further analyses are appropriate, review frequencies or summary statistics for missing data, produce simple summary statistics for reports, create statistical tables from raw data, or create custom subsets of cases or variables from a particularly large collection to save download time and space on a personal computer.

Other NCVS Resources

Web Sites

American Statistical Association Committee on Law and Justice Statistics
http://www.amstat.org/comm/index.cfm?fuseaction=commdetails&txtComm=CA02

Bureau of Justice Statistics
http://www.ojp.usdoj.gov/bjs/

Centers for Disease Control Division of Violence Prevention
http://www.cdc.gov/ncipc/dvp/dvp.htm

Crimes Against Children Research Center
http://www.unh.edu/ccrc/

National Center for Juvenile Justice
http://www.ncjj.org/

The National Institute of Justice's Data Resource Program
http://www.nij.gov/funding/data-resources-program/welcome.htm

Office for Victims of Crime
http://www.ojp.usdoj.gov/ovc/

Office of Justice Programs: Violence Against Women and Family Violence Program
http://www.nij.gov/topics/crime/violence-against-women/welcome.htm

Publications

The link below will search the ICPSR citations database for citations of publications with "National Crime Victimization Survey" in the title. Users can create their own searches or browse the citations database through our Publications Bibliography Web page.

Search for NCVS Publications