Development and Validation of an Actuarial Risk Assessment Tool for Juveniles with a History of Sexual Offending, 5 U.S. states, 2009-2013 (ICPSR 38335)

Version Date: Aug 30, 2022

Principal Investigator(s):
KiDeuk Kim, Urban Institute

https://doi.org/10.3886/ICPSR38335.v1

Version V1


Because few empirically valid and reliable tools exist for assessing the risk of recidivism among youth with a history of sexual offending, knowledge and practice in this area have historically been limited. This project examined current practice and policy in the assessment, treatment, and management of juvenile sex offenders across multiple jurisdictions (Florida, New York, Oregon, Pennsylvania, and Virginia). The researchers developed a prototype assessment tool, state-specific risk assessment models, and practical guidance for building a risk assessment for sexual recidivism in juvenile justice settings.

The data file contains individual records for the full sample (n=8,035), including their risk predictors, recidivism measures, and resulting outputs (i.e., predicted probabilities of sexual recidivism) from the risk models.

Kim, KiDeuk. Development and Validation of an Actuarial Risk Assessment Tool for Juveniles with a History of Sexual Offending, 5 U.S. states, 2009-2013. Inter-university Consortium for Political and Social Research [distributor], 2022-08-30. https://doi.org/10.3886/ICPSR38335.v1

United States Department of Justice. Office of Justice Programs. National Institute of Justice (2013-AW-BX-0053)

State

Access to these data is restricted. Users interested in obtaining these data must complete a Restricted Data Use Agreement, specify the reasons for the request, and obtain IRB approval or notice of exemption for their research.

Inter-university Consortium for Political and Social Research

2009-01-01 -- 2013-12-31

The primary objective of this study was to advance the field of juvenile sex offender management and treatment. The researchers aimed to develop and validate a state-of-the-art actuarial risk assessment instrument that effectively predicted the risk of sexual recidivism for juveniles with a history of sexual offending. In partnership with five jurisdictions around the country (Florida, New York, Oregon, Pennsylvania, and Virginia), the researchers compiled a dataset of juveniles with a history of sexual offending; they then used the data to construct and validate the risk assessment instrument.

Due to issues with data compatibility across the five jurisdictions, only a subset of the study sample (n=1,836) was used for the construction and validation of the instrument. A series of modeling strategies, including a traditional logistic regression approach and machine learning algorithms, was used to predict sexual recidivism (defined as being re-arrested for a sex crime within two years).
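The outcome definition above (re-arrest for a sex crime within two years) can be illustrated with a minimal Python sketch. The function names and date inputs are hypothetical and are not taken from the study's data file. The sketch also assumes the two-year window starts at an index date supplied by the analyst, since the description does not say when the follow-up clock starts.

```python
from datetime import date

def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def sexual_recidivism_flag(index_date, sex_rearrest_dates, window_years=2):
    """Return 1 if any sex-crime re-arrest falls within the follow-up window, else 0.

    Assumption: the window is (index_date, index_date + window_years], where
    index_date is a hypothetical start-of-follow-up date chosen by the analyst.
    """
    cutoff = add_years(index_date, window_years)
    return int(any(index_date < d <= cutoff for d in sex_rearrest_dates))

# Illustrative, made-up dates
flag = sexual_recidivism_flag(date(2010, 3, 1), [date(2011, 6, 15)])
```

A re-arrest one day after the cutoff is coded 0 under this convention. A different choice of start date or boundary rule would change the coding at the margins.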

This study utilized two methods of data collection:

  1. Collecting historical administrative data from the partner sites, and
  2. Conducting site visits with each partner agency to collect information on current policy and practice in sex offender treatment and management through semi-structured interviews and focus groups.

During the initial visits to each of the partner sites, the project team gathered information about the types of administrative data that each site maintains. Though the availability of specific items varied across sites, the sites typically collected data about prior juvenile and criminal justice involvement, history of delinquency, or referrals to child protection agencies; social history; family and other social supports; substance use/abuse; employment history; school performance and conduct; personality traits; and conditions of supervision and sex offender treatment history. During subsequent visits, the project team interviewed key personnel within each jurisdiction, including juvenile justice agency staff (e.g., treatment directors, probation and/or correctional officers), as well as residential outpatient treatment facility staff (e.g., program directors and clinical staff), to learn about current practice in the assessment, management, and treatment of youth who have sexually offended.

To create the predictive model, the project team tested many different machine learning and more traditional algorithms to identify the best-performing model. The traditional approaches included logistic regression and decision trees, while the machine learning approaches included regularized logistic regression, stochastic gradient descent, artificial neural networks, and support vector machines. For each model, the project team used five-fold cross-validation and tested many different parameter settings on the training set. The project team then evaluated performance on the test set by calculating the area under the curve (AUC), overall accuracy, sensitivity, and specificity.
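As a rough illustration of the evaluation steps described above, the following Python sketch shows one way to form five cross-validation folds and to compute AUC, overall accuracy, sensitivity, and specificity from predicted probabilities. It is not the project team's code. All function names, the 0.5 classification threshold, and the random seed are assumptions made for the example.

```python
import random

def five_fold_indices(n, k=5, seed=0):
    """Shuffle record indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def train_validation_splits(folds):
    """Yield (train_indices, validation_indices), holding out each fold once."""
    for i, held_out in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, held_out

def auc(y_true, scores):
    """AUC as the share of (recidivist, non-recidivist) pairs ranked correctly.

    Ties count as half a correct ranking. Requires at least one case of each class.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed probability threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of recidivists flagged
        "specificity": tn / (tn + fp),  # share of non-recidivists not flagged
    }

# Made-up labels (1 = recidivist) and predicted probabilities
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
example_auc = auc(y_true, scores)
example_metrics = classification_metrics(y_true, scores)
```

Because sexual recidivism is typically a rare outcome, sensitivity and specificity depend heavily on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds.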

The project team selected an initial set of predictors for model building on the basis of their theoretical salience and availability across the jurisdictions. They conducted a qualitative review of each of the common data fields across the jurisdictions to determine which predictors should be used in model building. The machine learning algorithm then exploited the potential of those theoretically relevant predictors for separating recidivists from non-recidivists.

Cross-sectional ad-hoc follow-up

Juveniles who sexually offended between 2009 and 2013.

Individual

The dataset includes information on youth from each of the five sites whose current disposition was for a sex offense or who had previously been adjudicated delinquent of a sex offense. It provides the following information:

  • Demographic characteristics
  • Current offense
  • Past delinquency and criminal justice involvement
  • School performance
  • Peer associations
  • Personality traits
  • Other factors that are traditionally known in criminological research as relevant to criminal behaviors

It also includes the resulting outputs (i.e., predicted probabilities of sexual recidivism) from the risk models.

Four scales of sexual recidivism risk are included in the data file. The scales were developed from traditional logistic regression and machine learning algorithms, including regularized logistic regression, gradient boosting machine, and artificial neural network.
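For readers unfamiliar with how a logistic regression model turns predictors into a predicted probability, a minimal Python sketch follows. The predictor name, intercept, and coefficient are hypothetical values chosen for illustration. They are not estimates from this study, and the sketch does not reproduce any of the four scales in the data file.

```python
import math

def predicted_probability(intercept, coefficients, predictors):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum of coefficient * value)))."""
    linear = intercept + sum(
        coefficients[name] * value for name, value in predictors.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical model with a single made-up predictor
p = predicted_probability(
    intercept=-2.0,
    coefficients={"prior_referrals": 0.5},
    predictors={"prior_referrals": 2},
)
```

With these made-up numbers the linear score is -1, so the predicted probability is 1 / (1 + e), roughly 0.27.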


2022-08-30


Notes

  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

  • One or more files in this data collection have special restrictions. Restricted data files are not available for direct download from the website; click on the Restricted Data button to learn more.