Step 3: How to Prepare NACJD Data Files for Use with a GIS Software Package

NACJD distributes data as ASCII data files. An ASCII data file consists of rows and columns of alphanumeric characters. Since ASCII data files are simply text files, they can be opened in any word processing program or Internet browser. However, the alphanumeric characters are not meaningful without the help of a codebook or data definition statements to define the columns of the ASCII data file as specific variables.

The figure below shows the raw ASCII file for the Crimes Reported data (Part 4) of the Uniform Crime Reporting Program Data [United States]: County-Level Detailed Arrest and Offense Data, 1999 (ICPSR 3167). More information about reading ASCII data files can be found on the web page "How do I interpret a record from an ASCII data file?"

[Figure: raw ASCII file example]

NACJD data files are also fixed-width files, which means that the user has to specify the column locations of each variable (as opposed to delimited files in which variables are separated by a comma or a tab). These files cannot be read directly into a mapping software package such as ArcView or MapInfo without first making some changes to the files.
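To illustrate what specifying column locations means, here is a minimal Python sketch that slices one fixed-width record into named variables. The variable names, column positions, and sample record are hypothetical and are not taken from the ICPSR 3167 codebook; the real positions must come from the codebook or data definition statements for the file being used.

```python
# Hypothetical layout: (variable name, start column, end column).
# Columns are 1-based and inclusive, the way a codebook usually lists them.
# These positions are illustrative only, not the real ICPSR 3167 layout.
LAYOUT = [
    ("FIPS_ST", 1, 2),    # two-digit state FIPS code
    ("FIPS_CTY", 3, 5),   # three-digit county FIPS code
    ("MURDER", 6, 10),    # a count, right-justified in five columns
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of variable name -> text."""
    # Values are kept as text so that leading zeroes are not lost.
    return {name: line[start - 1:end].strip() for name, start, end in layout}


# A made-up ten-character record that follows the hypothetical layout.
sample = "01001    3"
record = parse_record(sample)
print(record)
```

Keeping every value as text at this stage is a deliberate choice: converting the FIPS columns to numbers would drop their leading zeroes.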

Preparing an NACJD fixed-width ASCII data file for import into a GIS software package requires three steps:

  1. The NACJD data file must contain a variable (field) that can be matched to a variable in the map file so that the two files can be joined. The user must identify this variable in both the map file and the NACJD data file and confirm that the contents match. The variable name does not have to be the same in each file, but the contents must match. For example, in order to add the Crimes Reported data from the 1999 UCR county-level data collection as attribute data to an existing GIS, there must be a variable in this data file that links to the geography of the map file. Map files of U.S. counties from ArcView and MapInfo contain FIPS codes that uniquely identify each county. The 1999 UCR data file also contains FIPS codes that can be matched with the map file. If you do not know how to view the variables of map files in ArcView or MapInfo, this is explained in Step 4 of this tutorial.

  2. The variable that will be used to link the NACJD data file to the map file must be specified and formatted to match the map file. For example, map files of U.S. counties from ArcView or MapInfo have one field containing a five-digit FIPS code. The first two digits of the FIPS code designate the state and the last three digits specify the county. The field is formatted to preserve leading zeroes. For example, the FIPS code for Autauga County, Alabama is the five-digit code "01001" and not "1001". The Crimes Reported file from the 1999 UCR county-level data collection contains two FIPS variables: a two-digit state variable and a three-digit county variable. These two variables can be combined into one five-digit variable that is also formatted to preserve the leading zeroes. This produces a variable that can be matched to the map file.

  3. The NACJD ASCII data file must be saved in a file format that can be read by the GIS software. Both ArcView and MapInfo accept tab-delimited ASCII files (*.txt) and dBASE (*.dbf) files. MapInfo also supports Lotus 1-2-3, Microsoft Access, and Microsoft Excel files. This tutorial uses dBASE files as examples; however, the same steps described below can be used to save files in the other acceptable formats, if desired.
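The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the variable names, the counts, and the set of map-file FIPS codes are hypothetical, and the output is tab-delimited text rather than the dBASE format used in this tutorial's examples.

```python
import csv
import io


def make_fips(state, county):
    """Combine state and county codes into one zero-padded five-digit string."""
    return f"{int(state):02d}{int(county):03d}"


# Hypothetical records, shown as they might look after a numeric import
# has already stripped the leading zeroes. The counts are made up.
records = [
    {"FIPS_ST": 1, "FIPS_CTY": 1, "MURDER": 3},
    {"FIPS_ST": 6, "FIPS_CTY": 37, "MURDER": 120},
]

# Hypothetical set of five-digit codes read from the map file's FIPS field.
map_fips = {"01001", "06037", "48201"}

# Step 2: build the single five-digit linking variable as text.
rows = [
    {"FIPS": make_fips(r["FIPS_ST"], r["FIPS_CTY"]), "MURDER": r["MURDER"]}
    for r in records
]

# Step 1: confirm that every data record has a matching code in the map file.
unmatched = [row["FIPS"] for row in rows if row["FIPS"] not in map_fips]

# Step 3: write tab-delimited text (to memory here; a *.txt file in practice).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["FIPS", "MURDER"], delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
tab_text = buffer.getvalue()

print(tab_text)
print("Unmatched codes:", unmatched)
```

Storing the combined code as a string is what keeps "01001" from collapsing to "1001". An empty list of unmatched codes suggests that the data file and the map file can be joined on this field.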

These changes to the NACJD data file can be made with statistical, spreadsheet, or database software. This tutorial provides examples of how to make the necessary changes with SAS for Windows, SPSS for Windows, Microsoft Excel, and Microsoft Access.

