This study sought to determine the extent to which prosecutorial discretion contributes to unwarranted racial and ethnic disparities in felony and misdemeanor cases closed in New York County over a 20-month period in 2010-2011.
This study took a mixed-methods approach to data collection, combining the main source of data (the administrative dataset generated by the New York County District Attorney's Office's (DANY) case-management systems) with prosecutorial interviews.
DANY Case Data: Cases from DANY examined in this study included all misdemeanors, violations, and infractions disposed of in 2010-2011, but felonies were limited to drug offenses, weapons offenses, domestic violence, burglary, and robbery. All cases were selected using the most serious "screening charge", i.e., the top charge, determined by a reviewing assistant district attorney (ADA) at the Early Case Assessment Bureau (ECAB). In addition, data from the DANY human resources department was requested to provide information on prosecutors' characteristics, including demographics, level of experience, and caseload. A total of 222,542 cases were examined. Additionally, randomly selected subsamples of misdemeanor marijuana and felony non-marijuana drug cases were chosen, and information on arrest circumstances and evidence factors was gathered from prosecutors' paper files to supplement analyses.
Prosecutorial Interviews: To learn about case processing at DANY, as well as how prosecutors record information electronically and in case files, researchers interviewed 16 Assistant District Attorneys (ADAs) of varied levels of seniority (there were 647 ADAs who were assigned cases from the dataset) and from different trial bureaus using a semi-structured qualitative questionnaire that took approximately 30 minutes to complete. These interviews also served as an opportunity to talk to these ADAs about the study, including its research questions, data collection, analysis plans, and possible implications for DANY policy and practice.
All cases came from the administrative dataset generated by the New York County District Attorney's Office's (DANY) case-management systems and were provided by DANY's Planning and Management Office.
There were 2,409 drug offenses sampled from the main DANY data. The cases were all selected using the most serious screening charge, i.e., the top charge, determined by a reviewing assistant district attorney at the Early Case Assessment Bureau. Additionally, the study uses a case-level, as opposed to charge-level, analysis, which means that many cases in the dataset have multiple charges and/or counts.
There were 16 prosecutors interviewed from DANY using a semi-structured qualitative questionnaire. The prosecutors came from varied levels of seniority and trial bureaus.
The marijuana misdemeanor offenses data was composed of a random sample of 1,256 marijuana misdemeanor cases stratified by defendants' race and ethnicity. White defendants were over-sampled to ensure groups of comparable size. This sample only included cases disposed of through guilty pleas and excluded defendants under the age of 16.
For non-marijuana felony drug offenses, researchers selected 1,153 cases, including 400 cases with black defendants, 400 cases with Latino defendants, and 353 cases with white defendants (i.e., all white defendants available in the population of cases).
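The stratified draw described above (a fixed number of cases per racial or ethnic group, keeping every case when a group falls short of the target) can be illustrated with a minimal sketch. This is not the researchers' actual procedure: the function name `stratified_sample`, the `race` field, the per-group target of 400 for white defendants, and the toy population sizes for black and Latino defendants are all assumptions made for illustration. Only the resulting sample counts (400, 400, 353) come from the description above.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(cases, quotas, group_key="race", seed=0):
    """Randomly draw up to quotas[group] cases from each group.

    When a group has fewer cases than its quota, every case in that
    group is kept (as with the 353 white defendants described above).
    """
    rng = random.Random(seed)

    # Bucket the cases by group.
    by_group = defaultdict(list)
    for case in cases:
        by_group[case[group_key]].append(case)

    sample = []
    for group, quota in quotas.items():
        pool = by_group.get(group, [])
        if len(pool) <= quota:
            sample.extend(pool)                     # take the whole stratum
        else:
            sample.extend(rng.sample(pool, quota))  # simple random draw
    return sample


# Hypothetical population: the black and Latino totals are invented;
# only the white total (353) matches the study description.
population = (
    [{"id": i, "race": "black"} for i in range(1000)]
    + [{"id": i, "race": "latino"} for i in range(1000, 1800)]
    + [{"id": i, "race": "white"} for i in range(1800, 2153)]
)

# Assumed target of 400 cases per group.
quotas = {"black": 400, "latino": 400, "white": 400}

sample = stratified_sample(population, quotas)
counts = Counter(case["race"] for case in sample)
print(len(sample), dict(counts))
```

Fixing the seed makes the draw reproducible, which helps if the sampled case IDs later need to be matched against paper files.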
Criminal court cases closed in the New York County Court system from 2010-2011 and Assistant District Attorneys of New York in August of 2012.
Prosecutorial interviews of 16 Assistant District Attorneys in New York County.
Data from the DANY human resources department.
Paper case files in DANY office from 2010-2011.
Administrative dataset generated by the New York County District Attorney's Office's (DANY) case-management systems from 2010-2011.
administrative records data
The data in this study was collected from the District Attorney of New York's (DANY) offices and contains 4 SPSS datasets and 3 supplemental STATA datasets.
DANY_Pop_Data_FINAL.sav: This is considered the main dataset for the study and contains 282 variables and 222,542 cases. The variables include:
- Demographic information on the defendant and victim: Race, age, gender, etc.
- Defendant's criminal background: prior convictions, prior sentences and prior incarceration.
- Information on Assistant District Attorney (ADA): Race, gender, tenure, caseload and current bureau.
- Case details of crime: Evidence, officers involved, type of crime, etc.
- Case outcome: ADA decisions on prosecution including plea offers, arrest charges, community service sentence and hours, minimum and maximum jail time sentences, probation, and fines.
- Neighborhood of arrest: Borough of incident, date of arrest, date case was screened, date of indictment, date of case disposal, household income, and receipt of government assistance.
DANY_Pop_Heckman.dta: This file contains 290 variables and 22,542 cases. It contains variables similar to those in DANY_Pop_Data_FINAL.sav with some additional recodes, including an outcome variable (Custodial_Sent) recoded in order to perform a Heckman correction for selection bias. The STATA syntax file DANY_Pop_Heckman.do includes commands for the recode and the procedures for the selection bias correction. The included SPSS syntax file Syntax_Pop_Data.sps describes recodes made by the PI to the original DANY database information.
Fel_Drug_FINAL.sav: This dataset includes information on cases involving felony drug charges. It contains 374 variables and 1,153 cases. The variables contain information on the crime committed, plea offers made and accepted, ADA recommendations on bail, amount set for bond, whether the defendant was released, and defense counsel argument. It also includes the defendant's demographic information and prior record, the ADA's demographic information and caseloads, case charges and indictments, and sentencing.
Fel_Drug_Imputed.dta: This file contains 65,533 cases and 389 variables. The file is similar to Fel_Drug_FINAL.sav but with multiply-imputed cases and some additional recoded variables on the number of months offered to defendants pre- and post-indictment, the type of drug recovered by police, and the amount of bond and bail set. The included STATA syntax file Fel_Drug_Regressions_MI.do contains the commands used to perform the multiple imputations, as well as the logistic regressions used in the analysis. The included SPSS syntax file Syntax_Felony_Drug_Data.sps describes the recodes performed by the PI.
Felony_Drug_For_IRR.sav: This dataset includes 54 variables and 30 cases. It was compiled using data entered by two researchers independently for the felony drug analysis in order to measure inter-rater reliability for some of the more subjectively interpreted items gathered from the prosecutors' paper case files. The variables include information on the crime committed, evidence, location of arrest, when the defendant was detained, and arresting officers' information.
Misd_Marij_FINAL.sav: This dataset includes information on cases that involve misdemeanor marijuana charges. It contains 289 variables and 1,256 cases. The variables contain information on plea offers and sentencing, details of the crime committed, the charges associated with the case, defendant's demographic information including prior record and ADA's demographic information and caseloads.
Misd_MJ_Imputed.dta: This file contains 293 variables and 22,896 cases. It is similar to Misd_Marij_FINAL.sav with additional multiply-imputed cases and variables on the interaction between the defendant's race and number of prior arrests and the interaction between the defendant's race and whether the defendant had a prior arrest. The included STATA syntax file Misd_MJ_Regressions_MI.do contains the commands used to perform the multiple imputation and the logistic regressions used in the analysis. The included SPSS syntax file Syntax_Misd_MJ_Data.sps describes recodes performed by the PI.
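As an illustration of how double-coded data such as Felony_Drug_For_IRR.sav can be used, the sketch below computes Cohen's kappa for one item coded independently by two researchers. The description above does not say which reliability statistic the researchers used, so kappa is an assumption here, chosen because it is a common measure of agreement between two coders. The function name `cohen_kappa` and the example codes are hypothetical and are not drawn from the dataset.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share
    of items on which the raters agree and p_e is the agreement
    expected by chance given each rater's category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")

    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Counter returns 0 for categories the second rater never used.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for one subjectively interpreted item
# (for example, whether some type of evidence was noted in the paper file).
coder_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
coder_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa is computed per item, so a file with many double-coded variables would yield one value per variable rather than a single overall score.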