National Crime Victimization Survey Resource Guide
The National Crime Victimization Survey (NCVS) series, previously called the National Crime Survey (NCS),
has been collecting data on personal and household victimization since 1973. An ongoing survey of a nationally
representative sample of residential addresses, the NCVS is the primary source of information on the
characteristics of criminal victimization and on the number and types of crimes not reported to law enforcement
authorities. It provides the largest national forum for victims to describe the impact of crime and
characteristics of violent offenders. Twice each year, data are obtained from a nationally representative
sample of roughly 49,000 households comprising about 100,000 persons on the frequency, characteristics, and
consequences of criminal victimization in the United States. The survey is administered by the U.S. Census
Bureau (under the U.S. Department of Commerce) on behalf of the Bureau of Justice Statistics (under the U.S.
Department of Justice).
The NCVS was designed with four primary objectives: (1) to develop detailed information about the victims
and consequences of crime, (2) to estimate the number and types of crimes not reported to the police, (3) to
provide uniform measures of selected types of crimes, and (4) to permit comparisons over time and types of
areas. The survey categorizes crimes as "personal" or "property." Personal crimes cover rape and sexual
attack, robbery, aggravated and simple assault, and purse-snatching/pocket-picking, while property crimes
cover burglary, theft, motor vehicle theft, and vandalism. Data from the NCVS are particularly
useful for calculating crime rates, both aggregated and disaggregated, and for determining changes in crime
rates from year to year.
Using the Resource Guide
NACJD, a part of the Inter-University Consortium for Political and
Social Research (ICPSR) at the University of Michigan, designed this Resource Guide for World Wide Web
users to learn about the NCVS dataset and to connect to other NCVS information sources.
The data include type of crime, month, time, and location of the crime, relationship between victim and
offender, characteristics of the offender, self-protective actions taken by the victim during the incident
and results of those actions, consequences of the victimization, type of property lost, whether the crime
was reported to the police and reasons for reporting or not reporting, and offender use of weapons, drugs,
and alcohol. Basic demographic information, such as age, race, gender, and income, is also collected to
enable analysis of crime by various subpopulations.
Each year of the NCVS consists of a hierarchical, or "full," file and various incident-level files.
The hierarchical file contains all of the incidents reported as occurring in a given year. Due to the timing
of sample interviews, it will also include incidents occurring in the year prior to and the year subsequent
to the given year. Incident-level files are prepared from the hierarchical file and are easier to work with.
Generally three incident-level files are available: single year incident-level file, concatenated multi-year
incident-level file (1992-present), and concatenated multi-year rape subset file (1992-present). Unlike the
hierarchical file, the single year incident-level files are "bounded" to include only incidents occurring
in the given year. The online version of the NCVS includes the concatenated multi-year incident-level file.
Users can easily select a particular year and crime type of interest.
Data from the National Crime Victimization Surveys (NCVS) are available from the ICPSR in a variety of
forms designed to facilitate their accessibility and analytical use. These include the full data collection,
which is stored as a hierarchically structured data file, and a variety of smaller extract files. The
organization and use of these files are described below.
Unlike most data files, which are stored in a rectangular format (also known as "flat" format) with
fixed record lengths, the structured file is organized into a hierarchical format which
corresponds to variations in household composition and in the occurrence of incidents of
victimization. Hierarchical datasets have varying record lengths, and each record is stored
sequentially in the data file. Hierarchical storage greatly reduces the size and space needed
to store and process the data. For the NCVS-92-present hierarchical data files, there are four
types of records: 1) address ID record, 2) household record, 3) person record, and 4) incident
record. The address ID and household records contain information about the household as reported
by the respondent and characteristics of the surrounding area as computed by the Bureau of the
Census. The person record contains information about each household member age 12 years and older
as reported by that person or proxy, with one record for each qualifying individual. Finally,
the incident record contains information drawn from the incident report completed for each
household incident or person incident mentioned during the interview.
To illustrate, a typical cluster of housing units might contain four households, each with varying numbers
of inhabitants some of whom report victimizations. The records for such a cluster might look as follows:
|RECORD||RECORD TYPE OR CONTENTS|
|1||Address ID #1|
|2||Household #1|
|3||Person #1 of Household #1|
|4||Incident #1 for Person #1 of Household #1|
|5||Address ID #2|
|6||Household #2|
|7||Person #1 of Household #2|
|8||Person #2 of Household #2|
|9||Incident #1 for Person #2 of Household #2|
|10||Incident #2 for Person #2 of Household #2|
|11||Address ID #3|
|12||Household #3|
|13||Person #1 of Household #3|
|14||Person #2 of Household #3|
|15||Incident #1 for Person #2 of Household #3|
|16||Incident #2 for Person #2 of Household #3|
|17||Incident #3 for Person #2 of Household #3|
|18||Person #3 of Household #3|
|19||Address ID #4|
|20||Household #4|
|21||Person #1 of Household #4|
|22||Incident #1 for Person #1 of Household #4|
|23||Incident #2 for Person #1 of Household #4|
|24||Person #2 of Household #4|
With four different types of records, twenty-four in all, this may seem like a very complex structure.
However, storing the data in this way allows for much greater technical efficiency.
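As a rough illustration of how position in the sequential file encodes the hierarchy, the following Python sketch nests a stream of records into an address, household, person, and incident tree. The numeric type codes, the `(record_type, label)` tuple layout, and the class names are assumptions made for the example; they are not the actual NCVS record layout, which is defined by the SAS and SPSS program files. The sample mirrors the first ten records of the listing above, with a household record assumed to follow each address ID record.

```python
from dataclasses import dataclass, field

# Record type codes follow the four types described above:
# 1 = address ID, 2 = household, 3 = person, 4 = incident.
ADDRESS, HOUSEHOLD, PERSON, INCIDENT = 1, 2, 3, 4


@dataclass
class Person:
    label: str
    incidents: list = field(default_factory=list)


@dataclass
class Household:
    label: str
    persons: list = field(default_factory=list)


@dataclass
class Address:
    label: str
    households: list = field(default_factory=list)


def nest_records(records):
    """Group a sequential stream of (record_type, label) pairs into a tree.

    Each record is attached to the most recent record one level above it,
    which is how position in the sequential file encodes the hierarchy.
    """
    addresses = []
    for record_type, label in records:
        if record_type == ADDRESS:
            addresses.append(Address(label))
        elif record_type == HOUSEHOLD:
            addresses[-1].households.append(Household(label))
        elif record_type == PERSON:
            addresses[-1].households[-1].persons.append(Person(label))
        elif record_type == INCIDENT:
            addresses[-1].households[-1].persons[-1].incidents.append(label)
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return addresses


# Hypothetical stand-in for records 1 through 10 of the illustration.
sample = [
    (ADDRESS, "Address ID #1"),
    (HOUSEHOLD, "Household #1"),
    (PERSON, "Person #1 of Household #1"),
    (INCIDENT, "Incident #1 for Person #1 of Household #1"),
    (ADDRESS, "Address ID #2"),
    (HOUSEHOLD, "Household #2"),
    (PERSON, "Person #1 of Household #2"),
    (PERSON, "Person #2 of Household #2"),
    (INCIDENT, "Incident #1 for Person #2 of Household #2"),
    (INCIDENT, "Incident #2 for Person #2 of Household #2"),
]

tree = nest_records(sample)
```

Here `tree[1].households[0].persons[1].incidents` holds the two incidents reported by Person #2 of Household #2, while Person #1 of that household has an empty incident list and occupies no extra space.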
If these data were stored in a rectangular fixed format ("flat") data file, a number of problems would
have to be dealt with. For example, a household-level dataset may be constructed where the maximum number
of persons is two and the maximum number of incidents reported is two per person. Four hypothetical
household records from these data, each representing one household, consist of the following: one record
with a single person and no incidents reported; a second record with two persons and no incidents reported;
a third record with two persons where person one reports one incident; and a fourth record with two persons
where person two reports two incidents. A rectangular data structure would require that the fixed record
length be large enough to contain all combinations of persons and incidents. In this example, the record
length would require one place for the address ID-level data, one place for the household-level data, two
places for the person-level data (one place for each person), and two places per person for the incident-level
data. Most of this space would consist of missing data padding demanded by the fixed record length.
Assuming that each record type is approximately equal in size, approximately 50% of this small and uncomplicated
file would consist of missing data padding. A hierarchical file avoids this wasted space by recording the
data in fixed record lengths but only for existing data recorded from the interviews, thus using space
more efficiently.
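The padding estimate for this four-household example can be checked with a short calculation. The Python sketch below counts one "slot" per place in the fixed-length record and assumes, as the text does, that every record type is the same size; the variable names are illustrative only. Under that assumption the share of padding works out to a little under half, in line with the approximate figure given above.

```python
# Slots reserved in every fixed-length record of the hypothetical flat file:
# 1 address ID + 1 household + 2 persons + 2 incidents for each of 2 persons.
MAX_PERSONS = 2
MAX_INCIDENTS_PER_PERSON = 2
SLOTS_PER_RECORD = 1 + 1 + MAX_PERSONS + MAX_PERSONS * MAX_INCIDENTS_PER_PERSON

# The four hypothetical households: (number of persons, total incidents).
households = [(1, 0), (2, 0), (2, 1), (2, 2)]


def used_slots(persons, incidents):
    """Slots holding real data: address ID, household, persons, incidents."""
    return 1 + 1 + persons + incidents


total_slots = SLOTS_PER_RECORD * len(households)
filled_slots = sum(used_slots(p, i) for p, i in households)
padding_share = (total_slots - filled_slots) / total_slots

print(f"{filled_slots} of {total_slots} slots hold data; "
      f"{padding_share:.0%} is padding")
```

The same households stored hierarchically would need only the filled slots, which is the saving the hierarchical format is designed to capture.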
SAS and SPSS program files provide the information needed to describe the composition of the varying record
lengths and create datasets or subsets suitable for analysis. To access the appropriate information from a
sequenced file, the underlying logic of the data structure must be understood. Using SAS and SPSS software,
the four record types found in the NCVS-92-present full hierarchical file are then linked together in a
hierarchical pattern, where the address ID record forms the basic building block for the file. The household,
person, and incident records are then linked in a subordinate way to this "root" record.
As a link is followed down from the root, the relationship is always one to many. In the NCVS-92-present data,
each occurrence of a household will have associated with it any number of persons, and each occurrence of
a person will have any number of incidents associated with it. In no case will a person-level record occur
without a household or with additional households superordinate to it. The same is true for incidents vis-a-vis
persons.
Because of the hierarchical nature of the NCVS-92-present data files, identification or "linkage"
variables have been added to each record type. These identification variables can be combined to
create a link from the root to the lowest level in the hierarchy, which in the case of these
hierarchical files is the incident record. The creation of links between the various record types
is crucial when generating various file structures, ensuring accurate retrieval, and permitting
file updating. The following is a list of the identification variables found in the NCVS-92-present full
hierarchical data files:
LINKAGE VARIABLES FOR RECORD TYPE 1
|V1002 ||ICPSR HOUSEHOLD IDENTIFICATION NUMBER
|V1003 ||YEAR AND QUARTER IDENTIFICATION
|V1004 ||SAMPLE NUMBER
|V1005 ||SCRAMBLED CONTROL NUMBER
|V1006 ||HOUSEHOLD NUMBER
|V1007 ||LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
|V1008 ||PANEL AND ROTATION GROUP
LINKAGE VARIABLES FOR RECORD TYPE 2
|V2002 ||ICPSR HOUSEHOLD IDENTIFICATION NUMBER
|V2003 ||YEAR AND QUARTER IDENTIFICATION
|V2004 ||SAMPLE NUMBER
|V2005 ||SCRAMBLED CONTROL NUMBER
|V2006 ||HOUSEHOLD NUMBER
|V2007 ||LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
|V2008 ||PANEL AND ROTATION GROUP
LINKAGE VARIABLES FOR RECORD TYPE 3
|V3002 ||ICPSR HOUSEHOLD IDENTIFICATION NUMBER
|V3003 ||YEAR AND QUARTER IDENTIFICATION
|V3004 ||SAMPLE NUMBER
|V3005 ||SCRAMBLED CONTROL NUMBER
|V3006 ||HOUSEHOLD NUMBER
|V3007 ||LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
|V3008 ||PANEL AND ROTATION GROUP
|V3009 ||PERSON SEQUENCE NUMBER
|V3010 ||PERSON LINE NUMBER
LINKAGE VARIABLES FOR RECORD TYPE 4
|V4002 ||ICPSR HOUSEHOLD IDENTIFICATION NUMBER
|V4003 ||YEAR AND QUARTER IDENTIFICATION
|V4004 ||SAMPLE NUMBER
|V4005 ||SCRAMBLED CONTROL NUMBER
|V4006 ||HOUSEHOLD NUMBER
|V4007 ||LAST TWO DIGITS OF SCRAMBLED CONTROL NO.
|V4008 ||PANEL AND ROTATION GROUP
|V4009 ||PERSON SEQUENCE NUMBER
|V4010 ||PERSON LINE NUMBER
For example, to ensure that a given person record is linked to its corresponding incident record, the
identification variables V3002 through V3010 on the person record could be combined and used to link with
the combination of variables V4002 through V4010 found on the incident record. Variables V1001, V2001, V3001,
and V4001 are one-digit record type identifiers which indicate group levels for a particular record.
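The person-to-incident link described here can be sketched in Python by combining the nine identification variables into a single composite key. The variable names come from the lists above; the record representation (plain dictionaries), the helper functions, and all identifier values are made up for illustration and do not reflect real NCVS data or the SAS and SPSS setup files.

```python
# Suffixes shared by the person (V30xx) and incident (V40xx) linkage variables.
LINK_SUFFIXES = ["02", "03", "04", "05", "06", "07", "08", "09", "10"]


def link_key(record, prefix):
    """Build a composite key from the linkage variables, e.g. V3002..V3010."""
    return tuple(record[f"{prefix}{suffix}"] for suffix in LINK_SUFFIXES)


def attach_incidents(persons, incidents):
    """Return {person key: [incident records]} for every person record."""
    linked = {link_key(p, "V30"): [] for p in persons}
    for incident in incidents:
        key = link_key(incident, "V40")
        if key not in linked:
            raise KeyError(f"incident has no matching person record: {key}")
        linked[key].append(incident)
    return linked


def make_record(prefix, values):
    """Helper for the illustration: map made-up values onto variable names."""
    return {f"{prefix}{s}": v for s, v in zip(LINK_SUFFIXES, values)}


# Made-up identifier values; only the last two (person sequence number and
# person line number) differ between the two persons.
person_a = make_record("V30", [101, 921, 15, 4477, 1, 77, 31, 1, 1])
person_b = make_record("V30", [101, 921, 15, 4477, 1, 77, 31, 2, 2])
incident = make_record("V40", [101, 921, 15, 4477, 1, 77, 31, 2, 2])

linked = attach_incidents([person_a, person_b], [incident])
```

In this sketch the incident attaches to `person_b` because all nine values match, and an incident whose key matches no person raises an error rather than being silently dropped, reflecting the rule that an incident never occurs without a person above it.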
Methodological Issues in the Measurement of Crime
Recall, Reference Periods, and Bounding
The NCVS uses a six month reference period. Respondents are asked to report crime experiences
occurring in the last six months. Generally, respondents are able to recall more accurately
an event which occurred within three months of the interview rather than one which occurred
within six months; they can recall events over a six-month period more accurately than over
a 12-month period. However, a shorter reference period would require more field interviews
per year, increasing the data collection costs significantly. These increased costs would
have to be balanced by cost reductions elsewhere (sample size is often considered). Reducing
sample size, however, reduces the precision of estimates of relatively rare crimes. In light
of these trade-offs of cost and precision, a reference period of six months is used for the NCVS.
A common concern of researchers employing reference periods in retrospective surveys is
telescoping. Telescoping refers to a respondent's misspecification of when an incident
occurred in relation to the reference period. For example, telescoping occurs if a respondent
is asked about victimizations within the last six months and erroneously includes a
victimization that occurred eight months ago. Telescoped events which actually occurred
prior to the reference period can be minimized at the time of the first interview by a
technique known as bounding. Bounding is achieved by comparing incidents reported in an
interview with incidents reported in a previous interview and deleting duplicate incidents
that were reported in the current reference period.
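The logic of bounding can be sketched in a few lines of Python. This is a simplified illustration, not the Census Bureau's procedure: it assumes the reference period is the six calendar months before the interview month, and it treats two reports as duplicates when a hypothetical crime type and short description match exactly, whereas the survey confirms apparent duplicates with the respondent. All field names and report values are invented.

```python
REFERENCE_MONTHS = 6


def month_index(year, month):
    """Convert a calendar year and month to a single running month count."""
    return year * 12 + (month - 1)


def in_reference_period(incident_ym, interview_ym):
    """True if the incident falls in the six months before the interview month."""
    gap = month_index(*interview_ym) - month_index(*incident_ym)
    return 1 <= gap <= REFERENCE_MONTHS


def bound(current, previous):
    """Drop current reports that duplicate a report from the prior interview."""
    seen = {(r["crime"], r["details"]) for r in previous}
    return [r for r in current if (r["crime"], r["details"]) not in seen]


# Prior interview (January 2020) recorded a burglary dated November 2019.
previous = [
    {"crime": "burglary", "details": "garage, bicycle taken", "when": (2019, 11)},
]

# Current interview (July 2020): the same burglary is reported again but
# misdated to February 2020, so it appears to fall inside the reference period.
current = [
    {"crime": "burglary", "details": "garage, bicycle taken", "when": (2020, 2)},
    {"crime": "theft", "details": "wallet taken at work", "when": (2020, 4)},
]

kept = bound(current, previous)
```

A date check alone would accept the misdated burglary, since February 2020 is within six months of a July 2020 interview. Comparing against the prior interview is what removes it, leaving only the theft in `kept`.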
In the NCS and NCVS designs, each visit to a household is used to bound the next one by comparing reports
in the current interview with those given six months prior. When a report appears to be a duplicate, the
respondent is reminded of the earlier report and asked if the new report represents the incident previously
mentioned or if it is different. The first interview at a household entering the sample is unbounded, and
data collected at these interviews are not included in NCS and NCVS estimates. However, if a household in
the sample moves and another moves into that address, the first interview with the replacement household is
unbounded but is included in NCS and NCVS estimates.
Proxy interviews are currently conducted for: (1) household members aged 12 or 13 if a knowledgeable
household member insists they not be interviewed directly by the interviewer, (2) persons incapable of
responding due to physical or mental incapacity, and (3) those persons who are away from the household
during the entire interview period.
Using the Online Survey Documentation and Analysis
The Online Survey Documentation and Analysis is a set of
programs for the documentation and web-based analysis of survey
data and procedures for creating customized subsets of datasets. Online Data Analysis is
available for NCVS only with the concatenated incident-level
file, not the full file. The unit of analysis for that file is the crime incident.
Online data analysis is recommended for users who would like to
search for variables of interest in a dataset, review frequencies or summary statistics of key variables
to determine what further analyses are appropriate, review frequencies or summary statistics for missing
data, produce simple summary statistics for reports, create statistical tables from raw data, or
create custom subsets of cases or variables from a particularly large
collection to save download time and space on a personal computer.
Other NCVS Resources
American Statistical Association Committee on Law and Justice Statistics
Bureau of Justice Statistics
Centers for Disease Control Division of Violence Prevention
Crimes Against Children Research Center
National Center for Juvenile Justice
The National Institute of Justice's Data Resource Program
Office for Victims of Crime
Office of Justice Programs: Violence Against Women and Family Violence
The link below will search the ICPSR citations database for citations of publications with "National
Crime Victimization Survey" in the title. Users can create their own searches or browse the citations
database through our Publications Bibliography Web page.
Search for NCVS Publications