Step 3: How to Prepare NACJD Data Files for Use with a GIS Software Package
NACJD distributes data as ASCII data files. An ASCII data file consists of rows and columns of alphanumeric
characters. Since ASCII data files are simply text files, they can be opened in any word processing program or
Internet browser. However, the alphanumeric characters are not meaningful without the help of a codebook or data
definition statements to define the columns of the ASCII data file as specific variables.
This figure illustrates the raw ASCII file for the Crimes Reported Data (Part 4) of the
Uniform Crime Reporting Program Data [United States]: County-Level Detailed Arrest and Offense Data,
1999 (ICPSR 3167). More information about how to understand ASCII data files is available on the web
page "How do I interpret a record from an ASCII data file?"
NACJD data files are also fixed-width files, which means that the user has to specify the column locations of
each variable (as opposed to delimited files in which variables are separated by a comma or a tab). These files
cannot be read directly into a mapping software package such as ArcView or MapInfo without first making some changes
to the files.
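The idea of specifying column locations can be sketched in a few lines of Python. Python is not one of the packages covered by this tutorial, and the variable names and column positions below are hypothetical; they are not taken from the ICPSR 3167 codebook, which remains the authority on the real layout.

```python
# Hypothetical layout: (variable name, first column, last column), using
# 1-based inclusive column numbers as a codebook would list them.
# These names and positions are made up for illustration only.
LAYOUT = [
    ("FIPS_ST", 1, 2),
    ("FIPS_CTY", 3, 5),
    ("MURDER", 6, 10),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of raw text fields."""
    # Python slices are 0-based and end-exclusive, so column 1..2 is line[0:2].
    return {name: line[start - 1:end] for name, start, end in layout}


# A made-up 10-character record, not real UCR data.
sample = "01001    3"
record = parse_record(sample)
```

Each field comes back as text, padding included, so numeric fields still need to be stripped and converted before use.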
There are three necessary steps to preparing an NACJD fixed-width ASCII data file for importing into a GIS:
The NACJD data file must contain a variable (field) that can be matched to a variable in the map file
so that the two files can be joined together. The user must identify the variable in both the map file and
the NACJD data file to ensure that the contents of each variable match. The variable name does not have to be
the same in each file, but the contents must match. For example, in order to add the Crimes Reported data from
the 1999 UCR county-level data collection as attribute data to an existing GIS, there must be a variable in this
data file that will link to the geography of the map file. Map files of U.S. counties from ArcView and MapInfo
contain FIPS codes that uniquely identify each county. The 1999 UCR data file also contains FIPS codes that can
be matched with the map file. If you do not know how to view the variables of map files in ArcView or MapInfo,
this is explained in Step 4 of this tutorial.
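One way to confirm that the contents of the linking variable match is to compare the two sets of values before joining. The following is a minimal Python sketch, assuming the FIPS values from each file have already been read into lists; the function name and the sample values are hypothetical.

```python
def unmatched_keys(data_keys, map_keys):
    """Return (keys only in the data file, keys only in the map file)."""
    data_set = set(data_keys)
    map_set = set(map_keys)
    return sorted(data_set - map_set), sorted(map_set - data_set)


# Made-up values: the third data key has a leading blank instead of a zero,
# so it fails to match its counterpart in the map file.
only_in_data, only_in_map = unmatched_keys(
    ["01001", "01003", " 1005"],
    ["01001", "01003", "01005"],
)
```

If both returned lists are empty, every record in one file has a partner in the other.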
The variable that will be used to link the NACJD data file to the map file must be specified and formatted
to match the map file. For example, map files of U.S. counties from ArcView or MapInfo have one field containing
a five-digit FIPS code. The first two digits of the FIPS code designate the state and the last three digits specify
the county. The field is formatted to preserve leading zeroes. For example, the FIPS code for Autauga County, Alabama
is designated as a five-digit code of "01001" and not " 1001". The Crimes Reported file from the 1999 UCR
county-level data collection contains two FIPS variables: a two-digit state variable and a three-digit county
variable. These two variables can be combined into one five-digit variable that can also be formatted to preserve
the leading zeroes. This will produce a variable that can be matched to the map file.
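The combination of the two FIPS variables can be illustrated with a short Python sketch. This is not one of the SAS, SPSS, Excel, or Access procedures this tutorial covers, and the function name is hypothetical; it only shows the zero-padding logic.

```python
def combine_fips(state, county):
    """Build a five-character FIPS string from state and county codes.

    Accepts integers or text (blank padding is tolerated) and pads the
    state code to two digits and the county code to three digits.
    """
    return f"{int(state):02d}{int(county):03d}"


# Autauga County, Alabama: state 1, county 1.
autauga = combine_fips(1, 1)
```

The result is a text value. Storing it as a number would drop the leading zero and break the match with the map file.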
The NACJD ASCII data file must be saved in a file format that can be read by the GIS software. Both ArcView
and MapInfo accept tab-delimited ASCII files (*.txt) and dBASE (*.dbf) files, and MapInfo also supports Lotus 1-2-3,
Microsoft Access, and Microsoft Excel files. This tutorial uses dBASE files as examples; however, the same steps
described below can be used to save files in the other acceptable formats, if desired.
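As an illustration of one of the acceptable formats, the Python sketch below writes records to tab-delimited text using only the standard library. It does not produce a dBASE file, and the field names and values are made up. Whether a given GIS package keeps the FIPS field as text on import is an assumption to verify in that package.

```python
import csv
import io


def write_tab_delimited(rows, fieldnames, out):
    """Write dict records to `out` as tab-delimited text with a header row."""
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up records; FIPS is kept as text so the leading zero survives.
rows = [
    {"FIPS": "01001", "MURDER": 3},
    {"FIPS": "01003", "MURDER": 7},
]

# An in-memory buffer stands in for a file opened with
# open("crimes.txt", "w", newline="") (hypothetical file name).
buffer = io.StringIO()
write_tab_delimited(rows, ["FIPS", "MURDER"], buffer)
```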
These manipulations to the NACJD data file can be done with statistical, spreadsheet, or database software. This
tutorial will provide examples of how to make the necessary changes to the NACJD data file with SAS for Windows, SPSS
for Windows, Microsoft Excel, and Microsoft Access.