NIBRS File Structure

Single vs. Multiple Files

NIBRS data, as formatted by the FBI, are stored in a single file organized by segment level. There are six main segment levels: administrative, offense, property, victim, offender, and arrestee. Each segment level has a different record length and layout. Seven other segments appear less frequently.

NIBRS datasets contain millions of records and require hundreds of megabytes of storage. Significant computing resources are necessary to work with the data in their single-file format. In addition, the user must be experienced in working with complex file types.

Recognizing that users' computing resources differ widely, and that many users will be interested in only one or two segment levels, NACJD has decided to make the data available as multiple files. Each NIBRS segment level in the FBI's single-file format has been made into a separate rectangular raw data file. Linkage (key) variables are used to perform analyses that involve two or more segment levels.
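The idea of splitting one mixed file into one file per segment level can be pictured with a short Python sketch. The record layout used here is a toy assumption made only for illustration (a two-character level code at the start of each line); the actual FBI record layout is not described in this section, and `split_by_level` is a hypothetical helper name.

```python
from collections import defaultdict

def split_by_level(lines):
    """Group raw records by segment level code.

    ASSUMPTION: the first two characters of each record hold the level
    code (e.g. "01", "04"). This is a toy layout, not the FBI's.
    """
    by_level = defaultdict(list)
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        by_level[record[:2]].append(record)
    return dict(by_level)

# Toy single-file input: records of different lengths, mixed levels.
single_file = [
    "01 ORI0001 INC001 admin-fields",
    "04 ORI0001 INC001 001 victim-fields",
    "05 ORI0001 INC001 01 offender-fields",
    "04 ORI0001 INC001 002 victim-fields",
]

segments = split_by_level(single_file)
for level, records in sorted(segments.items()):
    print(level, len(records))
```

Within each resulting group every record shares one layout, which is what makes the per-level files rectangular.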

If the user is interested in variables contained on one segment level, then the data are easy to work with since each segment level file is simply a rectangular raw data file. SPSS and SAS setups are available to read each segment level. Also, with only one segment level, there is no choice regarding the unit of analysis; it is the segment level.

However, if the user is interested in variables that are contained on two or more segment levels, then it is necessary to merge files, decide on the unit of analysis, and decide on the number of records per level to read. These decisions are necessary regardless of whether the data are stored in one large file or in separate files.

Consider an example involving variables from the victim and offender segment levels. Three units of analysis are available: the victim, the offender, and the incident itself. In addition, there may be up to 999 victim records per incident and up to 99 offender records per incident. The vast majority of incidents have only 1 or 2 victims and offenders. If the user does not limit the number of records read per incident, the analysis file will contain mostly missing data and will be excessively large. In fact, the analysis file may be much larger than all the segment levels combined. So in this simple example the user must decide the unit of analysis, the number of records per segment level to read, and then merge the victim and offender files accordingly.
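The effect of the record limit on file size can be illustrated with a little arithmetic. The sketch below assumes an incident-level analysis file in which each incident is one row with a fixed number of victim and offender slots. As a simplification, each record is counted as a single cell. The function name `wide_file_stats` and the toy counts are hypothetical, not drawn from NIBRS data.

```python
def wide_file_stats(counts, victim_slots, offender_slots):
    """Return (total_cells, filled_cells) for an incident-level file.

    counts: list of (n_victims, n_offenders), one pair per incident.
    Each incident becomes one row with a fixed number of victim and
    offender slots; records beyond the slot limit are dropped and
    unused slots are left as missing.
    """
    total = len(counts) * (victim_slots + offender_slots)
    filled = sum(min(v, victim_slots) + min(o, offender_slots)
                 for v, o in counts)
    return total, filled

# Toy data: most incidents have 1 or 2 victims and offenders.
counts = [(1, 1), (1, 2), (2, 1), (1, 1), (3, 1)]

# Compare reading every possible record with reading at most 2 per level.
for v_slots, o_slots in [(999, 99), (2, 2)]:
    total, filled = wide_file_stats(counts, v_slots, o_slots)
    missing_pct = 100 * (total - filled) / total
    print(f"{v_slots} victim / {o_slots} offender slots: "
          f"{filled} of {total} cells filled ({missing_pct:.1f}% missing)")
```

With the full 999 and 99 slots, nearly every cell in the toy file is missing. Limiting each level to two records shrinks the file sharply, at the cost of dropping one victim record from the last incident.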

Linkage Variables

To work with variables listed on two or more segment levels, the individual segment level files must be merged. The variables used to merge the files are the originating agency identifier (ORI) and the incident number. All segment levels except the batch headers contain both the ORI and the incident number; the batch headers contain only the ORI. Within an individual segment level, the following variables, taken together, uniquely identify a record:

Level                      Variables
B1 - Batch Header 1 of 3   ORI
B2 - Batch Header 2 of 3   ORI
B3 - Batch Header 3 of 3   ORI
01 - Administrative        ORI, Incident number
02 - Offense               ORI, Incident number, UCR Offense code
03 - Property              ORI, Incident number, Type of Property Loss, Property Description
04 - Victim                ORI, Incident number, Victim sequence number
05 - Offender              ORI, Incident number, Offender sequence number
06 - Arrestee              ORI, Incident number, Arrestee sequence number
W1 - Incident              ORI, Incident number
W3 - Property              ORI, Incident number, Type of Property Loss, Property Description
W6 - Arrestee              ORI, Incident number, Arrestee sequence number
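
As a minimal sketch of how the linkage variables might be used, the Python below checks that the key variables for a segment level are unique and then merges victim and offender records on ORI and incident number, with the victim as the unit of analysis. The field names (`ori`, `incident`, `victim_seq`, `offender_seq`, `age`), the helper functions, and the toy records are all hypothetical; they are not the variable names used in the NACJD setup files.

```python
from collections import Counter, defaultdict

# Key variables for two segment levels, following the table above.
# Field names are illustrative assumptions.
KEYS = {
    "04": ("ori", "incident", "victim_seq"),
    "05": ("ori", "incident", "offender_seq"),
}

def duplicate_keys(records, key_fields):
    """Return key tuples that occur more than once (ideally none)."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]

def victim_level_merge(victims, offenders, max_offenders=2):
    """One output row per victim, with up to max_offenders offender
    records from the same ORI and incident attached."""
    by_incident = defaultdict(list)
    for o in offenders:
        by_incident[(o["ori"], o["incident"])].append(o)
    merged = []
    for v in victims:
        matches = sorted(by_incident.get((v["ori"], v["incident"]), []),
                         key=lambda o: o["offender_seq"])
        merged.append({**v, "offenders": matches[:max_offenders]})
    return merged

# Toy records. The incident number "A1" appears under two different
# ORIs, which is why both variables are needed for the merge.
victims = [
    {"ori": "XX0000001", "incident": "A1", "victim_seq": 1, "age": 34},
    {"ori": "XX0000001", "incident": "A1", "victim_seq": 2, "age": 29},
    {"ori": "XX0000002", "incident": "A1", "victim_seq": 1, "age": 51},
]
offenders = [
    {"ori": "XX0000001", "incident": "A1", "offender_seq": 1, "age": 40},
    {"ori": "XX0000001", "incident": "A1", "offender_seq": 2, "age": 22},
    {"ori": "XX0000001", "incident": "A1", "offender_seq": 3, "age": 19},
]

print(duplicate_keys(victims, KEYS["04"]))
print(duplicate_keys(offenders, KEYS["05"]))
merged = victim_level_merge(victims, offenders)
for row in merged:
    print(row["ori"], row["incident"], row["victim_seq"],
          [o["offender_seq"] for o in row["offenders"]])
```

Choosing the offender or the incident as the unit of analysis would follow the same pattern with a different file driving the outer loop.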