Missing Data in the Uniform Crime Reports (UCR), 1977-2000 [United States] (ICPSR 32061)
Version Date: Nov 26, 2012
Principal Investigator(s):
Joseph Targonski, University of Illinois-Chicago
https://doi.org/10.3886/ICPSR32061.v1
Version V1
Summary
This study reexamined and recoded missing data in the Uniform Crime Reports (UCR) for the years 1977 to 2000 for all police agencies in the United States. The principal investigator conducted a data cleaning of 20,067 Originating Agency Identifiers (ORIs) contained within the Offenses-Known UCR data from 1977 to 2000. Data cleaning involved performing agency name checks and creating new numerical codes for different types of missing data. These codes identify whether a record was aggregated to a particular month, whether no data were reported (true missing), whether more than one index crime was missing, whether a particular index crime (motor vehicle theft, larceny, burglary, assault, robbery, rape, murder) was missing, whether a value was assigned as missing by the researcher according to the "rule of 20", whether a value was an outlier, whether an ORI was covered by another agency, and whether an agency did not exist during a particular time period.
Citation
Funding
Subject Terms
Geographic Coverage
Smallest Geographic Unit
county
Distributor(s)
Time Period(s)
Date of Collection
Data Collection Notes
- The principal investigator submitted data for this project in Microsoft Excel format. ICPSR is distributing the Microsoft Excel data so that secondary users can view the color codes developed by the principal investigator for the various forms of missing data. Additionally, ICPSR converted the original Microsoft Excel data into a full suite of formats for preservation and dissemination, including SAS, SPSS, and Stata formats.
- More detailed information about imputation methodologies in the Offenses-Known Uniform Crime Reports, data cleaning, and the creation and testing of simulation datasets is available in the project's report (Targonski, 2011; NCJ 235152).
Study Purpose
The purpose of this study was to reexamine and recode missing data in the Uniform Crime Reports (UCR) for the years 1977 to 2000 for all police agencies in the United States.
Study Design
The principal investigator performed a data cleaning of 20,067 Originating Agency Identifiers (ORIs) based on the Offenses-Known Uniform Crime Reporting (UCR) Program Data from 1977 to 2000. The UCR Offenses-Known data collection assembles crime tabulations from what is known as the "Return A" form, which police agencies submit monthly. The form includes the crime index, which encompasses murder, rape, robbery, aggravated assault, burglary, larceny, and motor vehicle theft.
The agency-level files from 1977-2000 were merged by the principal investigator using the ORI as the key variable to create a single longitudinal dataset. The longitudinal dataset was further prepared and cleaned by the principal investigator to create the final version that is being distributed as part of this data collection. Data cleaning entailed performing agency name checks, identifying "true missing" values, creating monthly aggregation missing value codes, identifying agencies that are "covered by" another agency, flagging non-existent agencies, creating researcher assigned missing values according to the "rule of 20", and accounting for negative values as well as outlier values. Specifically, the principal investigator performed the following data cleaning tasks:
- Agency name checks were performed to ensure the ORI code for each year refers to one and only one agency and to determine the years in which the ORI existed.
- Any month with a missing value for the Return A variable DATE LAST UPDATE was recoded as a "true missing" value (-99).
- To accurately account for the number of months reported, months that were flagged as missing by the DATE LAST UPDATE were recoded using distinct monthly aggregation missing value codes (-112 through -102).
- Some smaller agencies choose to report their UCR data through a larger neighboring agency, rather than reporting directly to the FBI or a state reporting agency. This is a "covered by" situation, whereby the larger agency acts as the "covering" agency. To account for this in the analysis of missing data, a missing value code (-85) was assigned to months in which the agency was covered by another agency.
- For the years between 1977 and 2000 in which an ORI was not in existence, a separate missing value code (-80) was assigned to the months in which that agency did not exist.
- A missing value code (-90) was assigned according to a "rule of 20". The "rule of 20" established that an ORI with an average of 20 or more index crimes per month could not have zero index crimes in a month if the DATE LAST UPDATE flagged the Return A as being submitted.
- For the purpose of screening outliers in the negative values, -4 was determined as the cutoff for legitimate values. Any values less than -4 were recoded as missing values (-99), since they were most likely data entry errors.
- To identify additional outlier values, as part of the data screening process, each agency's trend was examined graphically. In the process, outliers were detected for the crime index. The outlier values were also recoded as -90.
- For the crimes of motor vehicle theft, larceny, burglary, assault, robbery, rape, and murder, missing data codes (-97 through -91) were assigned if a particular index crime was missing. Additionally, a separate missing data code (-98) was assigned if more than one index crime was missing.
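Three of the recoding rules listed above (true missing, the negative-value cutoff, and the "rule of 20") can be illustrated with a short Python sketch. This is not the principal investigator's code. The function name `recode_months`, the input layout, and the way the monthly average is computed (over submitted months with plausible values) are assumptions made for illustration only; the project's report describes the actual procedure.

```python
TRUE_MISSING = -99     # no data reported, or a negative value below the cutoff
RULE_OF_20 = -90       # researcher-assigned missing value ("rule of 20")
NEGATIVE_CUTOFF = -4   # values below this are treated as data entry errors


def recode_months(counts, submitted):
    """Recode one agency's monthly crime index totals (hypothetical helper).

    counts    -- list of raw monthly index crime totals
    submitted -- list of bools, True when DATE LAST UPDATE is present
    """
    # Assumption: the average is taken over submitted months whose value
    # is at or above the negative cutoff. The source does not specify this.
    valid = [c for c, s in zip(counts, submitted) if s and c >= NEGATIVE_CUTOFF]
    average = sum(valid) / len(valid) if valid else 0.0

    recoded = []
    for count, was_submitted in zip(counts, submitted):
        if not was_submitted:
            # Missing DATE LAST UPDATE: true missing.
            recoded.append(TRUE_MISSING)
        elif count < NEGATIVE_CUTOFF:
            # Implausibly negative value: recoded as missing.
            recoded.append(TRUE_MISSING)
        elif count == 0 and average >= 20:
            # A large agency reporting zero index crimes: rule of 20.
            recoded.append(RULE_OF_20)
        else:
            recoded.append(count)
    return recoded


example = recode_months(
    [30, 28, 0, 35, -12, 31],
    [True, True, True, True, True, False],
)
print(example)
```

In this sketch a small agency that averages fewer than 20 index crimes per month keeps its zero months as reported values, since the rule of 20 applies only to agencies at or above that average.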
Sample
The sample consists of 20,067 police agencies in the United States, as identified by all Originating Agency Identifiers (ORIs) in the Offenses-Known Uniform Crime Reporting data from 1977 to 2000.
Time Method
Universe
All police agencies in the United States between 1977 and 2000.
Unit(s) of Observation
Data Source
UNIFORM CRIME REPORTING PROGRAM DATA: 1975-1997 [ICPSR 9028]
UNIFORM CRIME REPORTING PROGRAM DATA: OFFENSES KNOWN AND CLEARANCES BY ARREST, 1999 [ICPSR 3158]
UNIFORM CRIME REPORTING PROGRAM DATA: OFFENSES KNOWN AND CLEARANCES BY ARREST, 1998 [ICPSR 2904]
UNIFORM CRIME REPORTING PROGRAM DATA: OFFENSES KNOWN AND CLEARANCES BY ARREST, 2000 [ICPSR 3447]
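The Study Design section describes merging the agency-level files from these source collections into a single longitudinal dataset keyed on the ORI. A minimal Python sketch of that kind of merge follows. The function name `merge_by_ori`, the dictionary layout, and the ORI values shown are hypothetical, and `None` is used only as a placeholder for years in which an ORI does not appear; the distributed data use the numeric missing value codes instead.

```python
def merge_by_ori(yearly_files):
    """Combine per-year records into one longitudinal record per ORI.

    yearly_files -- dict mapping year -> dict mapping ORI -> value
                    (hypothetical layout, one value per agency per year)
    """
    years = sorted(yearly_files)
    # Collect every ORI that appears in any year.
    all_oris = sorted({ori for records in yearly_files.values() for ori in records})
    # One row per ORI, one entry per year; None where the ORI is absent.
    return {
        ori: {year: yearly_files[year].get(ori) for year in years}
        for ori in all_oris
    }


# Made-up ORIs and totals, for illustration only.
files = {
    1977: {"XX00100": 120},
    1978: {"XX00100": 131, "XX00200": 410},
}
merged = merge_by_ori(files)
print(merged["XX00200"])
```

Keeping every ORI in every year, rather than dropping agencies that are absent from some files, is what allows non-existence and non-reporting to be distinguished later in the cleaning process.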
Data Type(s)
Mode of Data Collection
Description of Variables
This study contains a total of 410 variables including an Originating Agency Identifier (ORI) name and code, population totals by year, covering agency by year, statistical metropolitan area by year, county code by year, FBI group by year, and FBI crime index totals by month and year.
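Because the monthly crime index variables carry the missing value codes described under Study Design, a secondary user may want to tabulate how many months fall into each category before analysis. The Python sketch below shows one way to do this. The function names and category labels are illustrative assumptions, and the code ranges are taken from the Study Design description. The mapping of individual codes in the -97 through -91 range to specific crimes is not stated here, so that range is treated as a single category.

```python
from collections import Counter


def classify(value):
    """Map one monthly crime index value to a category label (hypothetical)."""
    if -112 <= value <= -102:
        return "monthly aggregation"
    if value == -99:
        # Also used for negative values below the -4 cutoff.
        return "true missing"
    if value == -98:
        return "more than one index crime missing"
    if -97 <= value <= -91:
        return "single index crime missing"
    if value == -90:
        return "rule of 20 or outlier"
    if value == -85:
        return "covered by another agency"
    if value == -80:
        return "agency did not exist"
    if value >= -4:
        # Small negative values down to -4 are treated as legitimate.
        return "reported"
    return "undocumented code"


def tabulate(values):
    """Count how many monthly values fall into each category."""
    return Counter(classify(v) for v in values)


print(tabulate([12, 0, -99, -90, -85, -85, -105, -3]))
```

A tabulation like this makes it easy to check, for example, whether an agency's gaps are mostly true non-reporting or mostly months covered by another agency, which call for different handling.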
Response Rates
Not applicable.
Presence of Common Scales
None.
Original Release Date
2012-11-26
Version History
2018-02-15 The citation of this study may have changed due to the new version control system that has been implemented. The previous citation was:
- Targonski, Joseph. Missing Data in the Uniform Crime Reports (UCR), 1977-2000 [United States]. ICPSR32061-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2012-11-26. http://doi.org/10.3886/ICPSR32061.v1
2012-11-26 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:
- Created variable labels and/or value labels.
- Checked for undocumented or out-of-range codes.
Notes
The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

This dataset is maintained and distributed by the National Archive of Criminal Justice Data (NACJD), the criminal justice archive within ICPSR. NACJD is primarily sponsored by three agencies within the U.S. Department of Justice: the Bureau of Justice Statistics, the National Institute of Justice, and the Office of Juvenile Justice and Delinquency Prevention.