Multi-level Analyses of Accuracy and Error in Digital Criminal Record Data, Minnesota and New Jersey, 2017-2019 (ICPSR 38208)
Version Date: Oct 27, 2022
Principal Investigator(s):
Sarah Lageson, Rutgers University
https://doi.org/10.3886/ICPSR38208.v1
Version V1
Summary
This is a three-level analysis of digital criminal record information. Drawing on original mixed-methods data from 178 research participants, the study first presents individual-level qualitative data ("micro-level" results) on experiences with digital criminal records, touching on criminal record accuracy, digital reputation, and digital avoidance strategies. The second analysis examines nearly 5,000 criminal history events listed on participants' official state criminal records, which consist of official arrest and charging data. In this "meso-level" analysis, these 4,874 criminal history events are tracked across a broad set of public-sector and private-sector criminal record repositories. Results show how the record-keeping practices of two states, Minnesota and New Jersey, translate into extralegal records that exist on the internet and in commercial databases. Further, this meso-level study details the thousands of criminal history events that originate outside state repositories and instead appear first in commercial vendor databases or internet-based repositories (N=3,368). These erroneous or misleading records are likely the result of mismatched and misunderstood bulk data processing, but they still pose practical problems for participants seeking employment, housing, and criminal record expungement. The final section, a "macro-level" analysis of criminal record disclosure, presents results from across the United States by reporting the disclosure practices of 200 criminal justice agencies and estimating the volume of personally identifiable criminal record information disclosed each year on the internet. The current study release contains only the meso-level data.
Citation
Funding
Subject Terms
Geographic Coverage
Smallest Geographic Unit
State
Restrictions
Access to these data is restricted. Users interested in obtaining these data must complete a Restricted Data Use Agreement, specify the reasons for the request, and obtain IRB approval or notice of exemption for their research.
Distributor(s)
Time Period(s)
Date of Collection
Study Purpose
The purpose of this study is to provide insight into three important observations: the diffuse and localized nature of criminal justice administration, high levels of inaccuracy in criminal history reports, and the persistent use of criminal records in legal and non-legal settings.
Study Design
This mixed-methods study traces the dissemination across data sources of "Criminal History Events," discrete data points that document contact with the justice system (i.e., arrests, booking photos, charges, dismissals, convictions, and punishments). In total, 178 participants released various versions of their criminal histories to researchers; these histories were then coded at the analytic unit of "criminal history events."
The study proceeds in three analytic components: the micro-level individual experience of criminal justice contact and the resulting data trail; the meso-level production of records by state governments and the drift of these records to the private sector; and the macro-level collection of these millions of records into "big data."
The meso-level data included the state-level production of records and the dissemination of criminal history events across third-party criminal record reporting platforms. Criminal records were obtained from eight sources, digitized, and stored on a secure server. Research Assistants and the PI entered data following a uniform criminal history event protocol for the meso-level analysis of public and private databases. Each single criminal justice incident that appears on any report was coded for: date (arrest date, filing date, and/or conviction date), data source, arrest/charge/conviction type, disposition, punishment, and any probation violations (if applicable). The arrest disposition was coded as "charged" or "uncharged," while the charge disposition ranged from "dismissed" to "guilty" with a minor set of additional outcomes, such as a "stay of adjudication."
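As a rough illustration of the coding protocol described above, one coded incident could be represented as a single record. This is a minimal sketch only: the class name, field names, and value labels are hypothetical and are not taken from the study's codebook or data files. Only the arrest disposition is validated, because the description gives its two values explicitly, while the charge disposition is described as an open-ended range.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Arrest dispositions named in the study description.
ARREST_DISPOSITIONS = {"charged", "uncharged"}


@dataclass
class CodedIncident:
    """One criminal justice incident as coded from a single report.

    All field names are illustrative assumptions, not the study's variable names.
    """
    participant_id: str
    data_source: str                  # which report the incident was read from
    incident_type: str                # arrest, charge, or conviction type
    arrest_date: Optional[date] = None
    filing_date: Optional[date] = None
    conviction_date: Optional[date] = None
    arrest_disposition: Optional[str] = None   # "charged" or "uncharged"
    charge_disposition: Optional[str] = None   # e.g. "dismissed", "guilty", "stay of adjudication"
    punishment: Optional[str] = None
    probation_violation: Optional[bool] = None  # None when not applicable

    def __post_init__(self) -> None:
        # Reject arrest dispositions outside the two coded values.
        if (self.arrest_disposition is not None
                and self.arrest_disposition not in ARREST_DISPOSITIONS):
            raise ValueError(f"unknown arrest disposition: {self.arrest_disposition!r}")


# Example with made-up values.
example = CodedIncident(
    participant_id="P001",
    data_source="state_rap_sheet",
    incident_type="arrest",
    arrest_date=date(2015, 3, 1),
    arrest_disposition="charged",
)
```

Keeping the three dates as separate optional fields mirrors the "and/or" wording of the protocol, since a given report may list only some of them.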
These singular criminal history events were then matched across databases. Criminal history events are defined by the date of first appearance of either an arrest or a criminal charge. Each event was coded for its appearance on other platforms, including the state court website, three background check services that produce Fair Credit Reporting Act (FCRA) compliant reports, one website that produces non-FCRA compliant reports, and two online mugshot galleries. Each research participant's name was also entered into a basic Google search for "name + criminal." The originating arrest or conviction event received a "score" for its frequency of appearance across applicable platforms, measuring disclosure (does the record appear in the other source?) and accuracy (does the record differ, or does another record appear in the external source?). In other words, scoring each criminal history event by how many times it appears across sources provides an estimate of match and mismatch across platforms. Additionally, any records that appeared in these outside sources but neither originated in nor appeared on the state criminal record were entered as separate criminal history events.
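The scoring and matching logic described above might be sketched as follows. This is an assumption-laden illustration, not the study's actual procedure: the platform labels, the use of None to mark a platform as not applicable, and the matching key (participant ID plus date of first appearance) are all hypothetical choices made for the example.

```python
from datetime import date
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical labels for the external platforms listed in the description.
PLATFORMS = [
    "state_court_website",
    "fcra_service_1", "fcra_service_2", "fcra_service_3",
    "non_fcra_website",
    "mugshot_gallery_1", "mugshot_gallery_2",
    "google_search",
]


def disclosure_score(appearances: Dict[str, Optional[bool]]) -> Tuple[int, int]:
    """Return (platforms where the event appears, applicable platforms).

    True = found, False = searched but not found,
    None or missing = platform not applicable to this event.
    """
    applicable = [p for p in PLATFORMS if appearances.get(p) is not None]
    found = sum(1 for p in applicable if appearances[p])
    return found, len(applicable)


# Assumed matching key: (participant id, date of first appearance).
EventKey = Tuple[str, date]


def external_only(state_keys: Set[EventKey],
                  external_keys: List[EventKey]) -> List[EventKey]:
    """Records seen in outside sources with no counterpart on the state record."""
    return [key for key in external_keys if key not in state_keys]


# Example with made-up values.
score = disclosure_score({
    "state_court_website": True,
    "fcra_service_1": True,
    "fcra_service_2": False,
    "mugshot_gallery_1": None,   # not applicable for this event
})

state = {("P001", date(2015, 3, 1))}
outside = [("P001", date(2015, 3, 1)), ("P001", date(2012, 7, 9))]
unmatched = external_only(state, outside)
```

Returning the count alongside the number of applicable platforms keeps events comparable when not every platform applies. Exact-date matching is a simplification; real records may need fuzzier matching on names, dates, and offense descriptions.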
Sample
There are 178 criminal record subjects who completed the study (Minnesota = 74; New Jersey = 104) by submitting their state rap sheet. These rap sheets contain 4,874 criminal history events (Minnesota = 2,444; New Jersey = 2,430). Participants' mean age is 42.5 years (SD = 13.94), with a minimum of 20 and a maximum of 80. A slight majority of participants are African American, particularly in the New Jersey site, where African Americans comprise nearly 82% of the sample. In contrast, nearly 75% of the sample in Minnesota is white. This reflects the population characteristics of the metropolitan areas from which participants were selected (Newark, NJ and Minneapolis-St. Paul, MN).
Time Method
Universe
Individuals possessing criminal records in Minnesota and New Jersey.
Unit(s) of Observation
Data Type(s)
Mode of Data Collection
Description of Variables
The variables in this study include descriptions of criminal history.
Response Rates
Not applicable.
Presence of Common Scales
Not applicable.
Notes
The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.
One or more files in this data collection have special restrictions. Restricted data files are not available for direct download from the website; click on the Restricted Data button to learn more.