National Crime Victimization Survey Resource Guide

About NCVS

The National Crime Victimization Survey (NCVS) series, previously called the National Crime Survey (NCS), has been collecting data on personal and household victimization since 1973. An ongoing survey of a nationally representative sample of residential addresses, the NCVS is the primary source of information on the characteristics of criminal victimization and on the number and types of crimes not reported to law enforcement authorities. It provides the largest national forum for victims to describe the impact of crime and characteristics of violent offenders. Twice each year, data are obtained from a nationally representative sample of roughly 49,000 households comprising about 100,000 persons on the frequency, characteristics, and consequences of criminal victimization in the United States. The survey is administered by the U.S. Census Bureau (under the U.S. Department of Commerce) on behalf of the Bureau of Justice Statistics (under the U.S. Department of Justice).

The NCVS was designed with four primary objectives: (1) to develop detailed information about the victims and consequences of crime, (2) to estimate the number and types of crimes not reported to the police, (3) to provide uniform measures of selected types of crimes, and (4) to permit comparisons over time and types of areas. The survey categorizes crimes as "personal" or "property." Personal crimes cover rape and sexual attack, robbery, aggravated and simple assault, and purse-snatching/pocket-picking, while property crimes cover burglary, theft, motor vehicle theft, and vandalism. NCVS data are particularly useful for calculating crime rates, both aggregated and disaggregated, and for determining changes in crime rates from year to year.

Using the Resource Guide

NACJD, a part of the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan, designed this Resource Guide for World Wide Web users to learn about the NCVS dataset and to connect to other NCVS information sources.

With this guide, first-time users or experienced analysts can:

NCVS Data

The data include type of crime, month, time, and location of the crime, relationship between victim and offender, characteristics of the offender, self-protective actions taken by the victim during the incident and results of those actions, consequences of the victimization, type of property lost, whether the crime was reported to the police and reasons for reporting or not reporting, and offender use of weapons, drugs, and alcohol. Basic demographic information, such as age, race, gender, and income, is also collected to enable analysis of crime by various subpopulations.

Each year of the NCVS consists of a hierarchical, or "full," file and various incident-level files. The hierarchical file contains all of the incidents reported as occurring in a given year. Due to the timing of sample interviews, it also includes incidents occurring in the year prior to and the year subsequent to the given year. Incident-level files are prepared from the hierarchical file and are easier to work with. Generally, three incident-level files are available: a single-year incident-level file, a concatenated multi-year incident-level file (1992-present), and a concatenated multi-year rape subset file (1992-present). Unlike the hierarchical file, the single-year incident-level files are "bounded" to include only incidents occurring in the given year. The online version of the NCVS includes the concatenated multi-year incident-level file. Users can easily select a particular year and crime type of interest.

File Structure

Data from the National Crime Victimization Surveys (NCVS) are available from the ICPSR in a variety of forms designed to facilitate their accessibility and analytical use. These include the full data collection, which is stored as a hierarchically structured data file, and a variety of smaller extract files. The organization and use of these files is described below.

Unlike most data files, which are stored in a rectangular format (also known as "flat" format) with fixed record lengths, the structured file is organized into a hierarchical format which corresponds to variations in household composition and in the occurrence of incidents of victimization. Hierarchical datasets have varying record lengths, and each record is stored sequentially in the data file. Hierarchical storage greatly reduces the size and space needed to store and process the data. For the NCVS-92-present hierarchical data files, there are four types of records: 1) address ID record, 2) household record, 3) person record, and 4) incident record. The address ID and household records contain information about the household as reported by the respondent and characteristics of the surrounding area as computed by the Bureau of the Census. The person record contains information about each household member age 12 years and older as reported by that person or proxy, with one record for each qualifying individual. Finally, the incident record contains information drawn from the incident report completed for each household incident or person incident mentioned during the interview.

To illustrate, a typical cluster of housing units might contain four households, each with varying numbers of inhabitants some of whom report victimizations. The records for such a cluster might look as follows:

RECORD   RECORD TYPE OR CONTENTS

First Household
1        Address ID #1
2        Household #1
3        Person #1 of Household #1
4        Incident #1 for Person #1 of Household #1

Second Household
5        Address ID #2
6        Household #2
7        Person #1 of Household #2
8        Person #2 of Household #2
9        Incident #1 for Person #2 of Household #2
10       Incident #2 for Person #2 of Household #2

Third Household
11       Address ID #3
12       Household #3
13       Person #1 of Household #3
14       Person #2 of Household #3
15       Incident #1 for Person #2 of Household #3
16       Incident #2 for Person #2 of Household #3
17       Incident #3 for Person #2 of Household #3
18       Person #3 of Household #3

Fourth Household
19       Address ID #4
20       Household #4
21       Person #1 of Household #4
22       Incident #1 for Person #1 of Household #4
23       Incident #2 for Person #1 of Household #4
24       Person #2 of Household #4

With four different types of records, twenty-four in all, this may seem like a very complex structure. However, storing the data in this way allows for much greater technical efficiency.
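The sequential layout in the illustration can be turned back into a nested structure with a single pass over the records. The following is a minimal Python sketch, not the distributed SAS or SPSS setup logic. It assumes each record has already been reduced to a (record type, label) pair, and that the type codes 1 through 4 follow the order in which the four record types are listed in this guide. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed type codes, following the order used in this guide:
# 1 = address ID, 2 = household, 3 = person, 4 = incident.
ADDRESS, HOUSEHOLD, PERSON, INCIDENT = 1, 2, 3, 4


@dataclass
class Person:
    label: str
    incidents: list = field(default_factory=list)


@dataclass
class Household:
    address: str
    label: str = ""
    persons: list = field(default_factory=list)


def nest_records(records):
    """Group a sequential list of (record_type, label) pairs by household."""
    households = []
    for record_type, label in records:
        if record_type == ADDRESS:
            # An address ID record opens a new group.
            households.append(Household(address=label))
        elif record_type == HOUSEHOLD:
            households[-1].label = label
        elif record_type == PERSON:
            households[-1].persons.append(Person(label))
        elif record_type == INCIDENT:
            # An incident belongs to the most recently read person.
            households[-1].persons[-1].incidents.append(label)
        else:
            raise ValueError(f"unknown record type: {record_type}")
    return households


# The third household from the illustration (records 11 through 18).
third = [
    (ADDRESS, "Address ID #3"),
    (HOUSEHOLD, "Household #3"),
    (PERSON, "Person #1"),
    (PERSON, "Person #2"),
    (INCIDENT, "Incident #1"),
    (INCIDENT, "Incident #2"),
    (INCIDENT, "Incident #3"),
    (PERSON, "Person #3"),
]
nested = nest_records(third)
incident_counts = [len(p.incidents) for p in nested[0].persons]
print(incident_counts)  # [0, 3, 0]
```

Because each record's parent is simply the most recent record of the level above it, no lookahead is needed, which is why the sequential order of the file matters.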

If these data were stored in a rectangular fixed format ("flat") data file, a number of problems would have to be dealt with. For example, a household-level dataset may be constructed where the maximum number of persons is two and the maximum number of incidents reported is two per person. Four hypothetical household records from these data, each representing one household, consist of the following: one record with a single person and no incidents reported; a second record with two persons and no incidents reported; a third record with two persons where person one reports one incident; and a fourth record with two persons where person two reports two incidents. A rectangular data structure would require that the fixed record length be large enough to contain all combinations of persons and incidents. In this example, the record length would require one place for the address ID-level data, one place for the household-level data, two places for the person-level data (one place for each person), and two places per person for the incident-level data. Much of this space would consist of missing data padding demanded by the fixed record length.

Assuming that each record type is approximately equal in size, nearly half of this small and uncomplicated file (14 of its 32 places, or about 44%) would consist of missing data padding. A hierarchical file avoids this wasted space by writing a fixed-length record of the appropriate type only for data actually recorded from the interviews, thus using space more efficiently.
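The padding estimate for the four hypothetical households can be checked with a short count. This Python sketch assumes, as the text does, that every record type occupies one equally sized place. The variable names are illustrative only.

```python
# Places in one flat record: 1 address ID + 1 household + 2 persons
# + 2 incidents for each of the 2 persons.
MAX_PERSONS = 2
MAX_INCIDENTS_PER_PERSON = 2
slots_per_record = 1 + 1 + MAX_PERSONS + MAX_PERSONS * MAX_INCIDENTS_PER_PERSON

# (persons, incidents) actually present in the four hypothetical households.
households = [(1, 0), (2, 0), (2, 1), (2, 2)]

flat_slots = slots_per_record * len(households)
used_slots = sum(1 + 1 + persons + incidents for persons, incidents in households)
padding_slots = flat_slots - used_slots
padding_share = padding_slots / flat_slots

print(flat_slots, used_slots, padding_slots)  # 32 18 14
print(round(padding_share, 4))                # 0.4375
```

Under the equal-size assumption, 14 of the 32 places are padding, a little under half of the flat file. The same four households need only 18 records in hierarchical form.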

SAS and SPSS program files provide the information needed to describe the composition of the varying record lengths and create datasets or subsets suitable for analysis. To access the appropriate information from a sequenced file, the underlying logic of the data structure must be understood. Using SAS and SPSS software, the four record types found in the NCVS-92-present full hierarchical file are linked together in a hierarchical pattern, where the address ID record forms the basic building block, or "root," of each group. The household, person, and incident records are then linked to this root in a subordinate way.

As a link is followed down from the root, the relationship is always one to many. In the NCVS-92-present data, each occurrence of a household will have associated with it any number of persons, and each occurrence of a person will have any number of incidents associated with it. In no case will a person-level record occur without a household or with additional households superordinate to it. The same is true for incidents vis-a-vis persons.
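The rule that no record appears without a parent above it can be expressed as a simple check on the sequence of record types. The sketch below is an illustration in Python, not part of the distributed setup files. It assumes type codes 1 through 4 for address ID, household, person, and incident records, and treats a record as orphaned when it sits more than one level below the record immediately before it. The function name is hypothetical.

```python
def find_orphans(record_types):
    """Return 0-based positions of records that lack a parent record."""
    orphans = []
    previous = 0  # 0 means no record has been read yet
    for position, record_type in enumerate(record_types):
        if record_type not in (1, 2, 3, 4):
            raise ValueError(f"unknown record type: {record_type}")
        # A record may start a new group, repeat the current level,
        # or go exactly one level deeper than the record before it.
        if record_type > previous + 1:
            orphans.append(position)
        previous = record_type
    return orphans


# Fourth household from the illustration (records 19 through 24).
print(find_orphans([1, 2, 3, 4, 4, 3]))  # []

# A person record placed directly under an address ID record.
print(find_orphans([1, 3, 4]))  # [1]
```

A check of this kind is a reasonable first step before building any extract, since a misplaced record would otherwise be attached silently to the wrong household or person.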

Because of the hierarchical nature of the NCVS-92-present data files, identification or "linkage" variables have been added to each record type. These identification variables can be combined to create a link from the root to the lowest level in the hierarchy, which in the case of these hierarchical files is the incident record. The creation of links between the various record types is crucial when generating various file structures, ensuring accurate retrieval, and permitting file updating. The following is a list of the identification variables found in the NCVS-92-present full hierarchical data files:

LINKAGE VARIABLES FOR RECORD TYPE 1

Identification Variable   Description
V1002                     ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V1003                     YEAR AND QUARTER IDENTIFICATION
V1004                     SAMPLE NUMBER
V1005                     SCRAMBLED CONTROL NUMBER
V1006                     HOUSEHOLD NUMBER
V1007                     LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V1008                     PANEL AND ROTATION GROUP

LINKAGE VARIABLES FOR RECORD TYPE 2

Identification Variable   Description
V2002                     ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V2003                     YEAR AND QUARTER IDENTIFICATION
V2004                     SAMPLE NUMBER
V2005                     SCRAMBLED CONTROL NUMBER
V2006                     HOUSEHOLD NUMBER
V2007                     LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V2008                     PANEL AND ROTATION GROUP

LINKAGE VARIABLES FOR RECORD TYPE 3

Identification Variable   Description
V3002                     ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V3003                     YEAR AND QUARTER IDENTIFICATION
V3004                     SAMPLE NUMBER
V3005                     SCRAMBLED CONTROL NUMBER
V3006                     HOUSEHOLD NUMBER
V3007                     LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V3008                     PANEL AND ROTATION GROUP
V3009                     PERSON SEQUENCE NUMBER
V3010                     PERSON LINE NUMBER

LINKAGE VARIABLES FOR RECORD TYPE 4

Identification Variable   Description
V4002                     ICPSR HOUSEHOLD IDENTIFICATION NUMBER
V4003                     YEAR AND QUARTER IDENTIFICATION
V4004                     SAMPLE NUMBER
V4005                     SCRAMBLED CONTROL NUMBER
V4006                     HOUSEHOLD NUMBER
V4007                     LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
V4008                     PANEL AND ROTATION GROUP
V4009                     PERSON SEQUENCE NUMBER
V4010                     PERSON LINE NUMBER

For example, to ensure that a given person record is linked to its corresponding incident records, the identification variables V3002 through V3010 on the person record could be combined and used to link with the combination of variables V4002 through V4010 found on the incident record. Variables V1001, V2001, V3001, and V4001 are one-digit record type identifiers which indicate group levels for a particular record.
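The person-to-incident link described above can be sketched in Python by combining the nine identification variables into a single key. This is an illustration under stated assumptions, not the logic of the SAS or SPSS program files. It assumes each record has already been read into a dictionary keyed by variable name, and every value in the sample records is invented for the example. The helper names are hypothetical.

```python
from collections import defaultdict

# Suffixes 002 through 010 are shared by the person (V3...) and
# incident (V4...) identification variables.
SUFFIXES = [f"{n:03d}" for n in range(2, 11)]
PERSON_KEY = [f"V3{s}" for s in SUFFIXES]
INCIDENT_KEY = [f"V4{s}" for s in SUFFIXES]


def link_key(record, variables):
    """Combine the identification variables into one hashable key."""
    return tuple(record[name] for name in variables)


def attach_incidents(persons, incidents):
    """Pair each person record with the incident records that share its key."""
    by_person = defaultdict(list)
    for incident in incidents:
        by_person[link_key(incident, INCIDENT_KEY)].append(incident)
    return [
        (person, by_person.get(link_key(person, PERSON_KEY), []))
        for person in persons
    ]


def make_record(prefix, values):
    """Build a sample record; the values below are invented placeholders."""
    return {f"{prefix}{s}": v for s, v in zip(SUFFIXES, values)}


person_a = make_record("V3", [1, 931, 15, "A1", 1, "07", 11, 1, 1])
person_b = make_record("V3", [1, 931, 15, "A1", 1, "07", 11, 2, 2])
incident = make_record("V4", [1, 931, 15, "A1", 1, "07", 11, 2, 2])

linked = attach_incidents([person_a, person_b], [incident])
counts = [len(matches) for _, matches in linked]
print(counts)  # [0, 1]
```

Using the full combination as the key, rather than any single variable, is the point of the design: persons in the same household share most identification values and differ only in the person-level variables.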

Methodological Issues in the Measurement of Crime

Using the Online Survey Documentation and Analysis

The Online Survey Documentation and Analysis is a set of programs for the documentation and web-based analysis of survey data and procedures for creating customized subsets of datasets. Online Data Analysis is available for NCVS only with the concatenated incident-level file, not the full file. The unit of analysis for that file is the crime incident.

Online data analysis is recommended for users who would like to search for variables of interest in a dataset; review frequencies or summary statistics of key variables to determine what further analyses are appropriate; review frequencies or summary statistics for missing data; produce simple summary statistics for reports; create statistical tables from raw data; or create custom subsets of cases or variables from a particularly large collection to save download time and space on a personal computer.

Other NCVS Resources

Web Sites

American Statistical Association Committee on Law and Justice Statistics
http://www.amstat.org/comm/index.cfm?fuseaction=commdetails&txtComm=CA02

Bureau of Justice Statistics
http://www.ojp.usdoj.gov/bjs/

Centers for Disease Control Division of Violence Prevention
http://www.cdc.gov/ncipc/dvp/dvp.htm

Crimes Against Children Research Center
http://www.unh.edu/ccrc/

National Center for Juvenile Justice
http://www.ncjj.org/

The National Institute of Justice's Data Resource Program
http://www.nij.gov/funding/data-resources-program/welcome.htm

Office for Victims of Crime
http://www.ojp.usdoj.gov/ovc/

Office of Justice Programs: Violence Against Women and Family Violence Program
http://www.nij.gov/topics/crime/violence-against-women/welcome.htm

Publications

The link below will search the ICPSR citations database for citations of publications with "National Crime Victimization Survey" in the title. Users can create their own searches or browse the citations database through our Publications Bibliography Web page.

Search for NCVS Publications