Assessment of Sexual Assault Kit (SAK) Evidence Selection Leading to Development of SAK Evidence Machine-Learning Model (SAK-ML Model), California, Idaho, Utah, 2010-2022 (ICPSR 39161)

Version Date: Jun 26, 2025

Principal Investigator(s):
Julie L. Valentine, University of Utah

https://doi.org/10.3886/ICPSR39161.v1

Version V1


Few studies have explored aggregated DNA analysis findings from sexual assault kits (SAKs) or the features that predict the development of useful DNA information related to the foreign contributor(s). Information gleaned from evaluating DNA analysis findings has significant practice and policy implications for both forensic medical examiners/sexual assault nurse examiners and forensic scientists. Results from this innovative study were obtained by tracking SAKs from evidence collection (data from sexual assault medical forensic examinations) through DNA analysis results (data from publicly funded laboratories).

This study does not include data files. It includes 13 Python files used for statistical analysis.

Valentine, Julie L. Assessment of Sexual Assault Kit (SAK) Evidence Selection Leading to Development of SAK Evidence Machine-Learning Model (SAK-ML Model), California, Idaho, Utah, 2010-2022. Inter-university Consortium for Political and Social Research [distributor], 2025-06-26. https://doi.org/10.3886/ICPSR39161.v1

United States Department of Justice. Office of Justice Programs. National Institute of Justice (2019-NE-BX-0001)
Inter-university Consortium for Political and Social Research

2010 -- 2022 (Utah), 2015 -- 2020 (Orange County), 2013 -- 2020 (Idaho)
2020 -- 2023
  1. The statistical models are available publicly through Zenodo: https://zenodo.org/records/11061747

The study had two purposes:

  • To evaluate decision-making protocols on DNA evidence contained in sexual assault kits (SAKs) in order to develop research-based guidelines on which swabs, and how many swabs, should be tested by crime labs.
  • To develop, implement, and evaluate a machine-learning statistical model, the SAK Evidence Machine-Learning Model (SAK-ML Model), to guide forensic scientists within publicly funded forensic laboratories in selecting the most probative SAK swabs to analyze.
The overarching goal of the study was to extract and analyze information related to SAK evidence collection and analysis to inform practice and policy.

Three publicly funded crime laboratories were collaborative research partners: Utah Bureau of Forensic Services (UBFS), the state crime laboratory in Utah; Orange County Crime Lab (OCCL), the county crime laboratory in Orange County, California; and Idaho State Police Forensic Services (ISPFS), the state crime laboratory in Idaho. Because the DNA analysis interpretation methods used by crime labs affect findings, it is important to note that a binary interpretation approach was employed at all three sites during the study period.

Cross-sectional

Victims aged 14 years and older who received a sexual assault medical forensic examination (SAMFE) from one of the participating forensic nursing teams and had an unrestricted sexual assault kit (SAK) collected. Years of inclusion are 2010-2022 in Utah, 2015-2020 in Orange County, and 2013-2020 in Idaho.



2025-06-26


Notes

  • These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.

  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

  • ICPSR usually offers files in multiple formats for researchers to be able to access data and documentation in formats that work well within their needs. If you have questions about the accessibility of materials distributed by ICPSR or require further assistance, please visit ICPSR’s Accessibility Center.


This dataset is maintained and distributed by the National Archive of Criminal Justice Data (NACJD), the criminal justice archive within ICPSR. NACJD is primarily sponsored by three agencies within the U.S. Department of Justice: the Bureau of Justice Statistics, the National Institute of Justice, and the Office of Juvenile Justice and Delinquency Prevention.