Terrorism and Preparedness Data Resource Center

What are setup files?

Many of our data collections that contain ASCII data files are accompanied by setup files that allow users to read the text files into statistical software packages. Because raw alphanumeric data files are impractical to interpret by eye, statistical software is needed to define, manipulate, extract, and analyze the variables and cases they contain. For many of our data collections we currently provide setup files for SAS, SPSS, and Stata, three of the more commonly used analytical software packages in the social sciences.

The following instructions explain the different components of SAS, SPSS, and Stata setup files. Setup files for certain collections may not contain all of the commands listed below.

SAS Setup Files

SAS setup files can be used to generate native SAS file formats such as SAS datasets, SAS xport libraries, and transport files. Our SAS setup files generally include the following SAS sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. PROC FORMAT: Creates user-defined formats for the variables. Formats replace original value codes with value code descriptions. Not all variables necessarily have user-defined formats.
  2. DATA: Begins a SAS data step and names an output SAS dataset.
  3. INFILE: Identifies the input data file to be read with the INPUT statement. Users must replace the "physical-filename" with host computer-specific input file specifications. For example, users on Windows platforms should replace "physical-filename" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located in the root directory "C:\".
  4. INPUT: Assigns the name, type, and decimal specification (if any) of each variable, and specifies its beginning and ending column locations in the data file.
  5. LABEL: Assigns descriptive labels to all variables. Variable labels and variable names may be identical for some variables.
  6. FORMAT: Associates the formats created by the PROC FORMAT step with the variables named in the INPUT statement.
  7. MISSING VALUE RECODES: Sets user-defined numeric missing values to missing as interpreted by the SAS system. Only variables with user-defined missing values are included in the statements.
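
To make the roles of these sections concrete, the following is a minimal Python sketch (not SAS code, and not taken from any ICPSR setup file) of what the INPUT, FORMAT, and missing value recode steps accomplish together. The variable names, column positions, value descriptions, and missing codes are all hypothetical and chosen only for illustration.

```python
# Hypothetical fixed-column layout: (name, start column, end column).
# Columns are 1-based and inclusive, as in a column-style INPUT statement.
LAYOUT = [
    ("STATE", 1, 2),
    ("SEX", 3, 3),
    ("AGE", 4, 6),
]

# Rough analogue of PROC FORMAT plus FORMAT: value code descriptions
# for the variables that have them (not every variable does).
FORMATS = {"SEX": {1: "Male", 2: "Female"}}

# Rough analogue of the missing value recodes: user-defined codes
# (hypothetical here) that should be treated as missing.
MISSING_CODES = {"SEX": {9}, "AGE": {999}}


def read_record(line):
    """Parse one fixed-width text record into a dict of numeric values."""
    record = {}
    for name, start, end in LAYOUT:
        raw = line[start - 1:end].strip()
        value = int(raw) if raw else None
        if value in MISSING_CODES.get(name, set()):
            value = None  # recode the user-defined missing code to missing
        record[name] = value
    return record


def describe(record, name):
    """Return the value description if one is defined, else the raw value."""
    value = record[name]
    return FORMATS.get(name, {}).get(value, value)


sample_lines = ["261 34", "019999"]
records = [read_record(line) for line in sample_lines]
```

In this sketch the second sample record carries the hypothetical missing codes 9 and 999, so both SEX and AGE end up as missing after the recode, while `describe(records[0], "SEX")` returns the description rather than the numeric code.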

SPSS Setup Files

SPSS setup files can be used to generate native SPSS file formats such as SPSS system files and SPSS portable files. SPSS setup files produced by ICPSR generally include the following SPSS sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. DATA LIST: Assigns the name, type, and decimal specification (if any) of each variable, and specifies its beginning and ending column locations in the data file. Users must replace the "physical-filename" with host computer-specific input file specifications. For example, users on Windows platforms should replace "physical-filename" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located in the root directory "C:\".
  2. VARIABLE LABELS: Assigns descriptive labels to all variables. Variable labels and variable names may be identical for some variables.
  3. VALUE LABELS: Assigns descriptive labels to codes in the data file. Not all variables necessarily have assigned value labels.
  4. MISSING VALUES: Declares user-defined missing values. Not all variables in the data file necessarily have user-defined missing values. These values can be treated specially in data transformations, statistical calculations, and case selection.
  5. MISSING VALUE RECODE: Sets user-defined numeric missing values to missing as interpreted by the SPSS system. Only variables with user-defined missing values are included in the statements.
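
The difference between declaring missing values (section 4) and recoding them (section 5) can be sketched in Python as follows. This is an illustration of the idea only, not SPSS syntax; the data values and the codes 8 and 9 are hypothetical.

```python
# Hypothetical values for one variable; 8 and 9 stand in for
# user-defined missing codes.
values = [2, 4, 9, 6, 8]
user_missing = {8, 9}

# Declaring: the codes stay in the data, but analyses skip them.
valid = [v for v in values if v not in user_missing]
mean_declared = sum(valid) / len(valid)

# Recoding: each code is replaced by a single missing marker (None here),
# so the distinction between 8 and 9 is no longer recoverable.
recoded = [None if v in user_missing else v for v in values]
valid_after_recode = [v for v in recoded if v is not None]
mean_recoded = sum(valid_after_recode) / len(valid_after_recode)
```

Both approaches give the same mean in this sketch. The practical difference is that the declared version still records which missing code applied to each case, while the recoded version does not.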

Stata Setup Files

Stata setup files can be used to generate native Stata DTA files. Stata setup files produced by ICPSR generally include the following Stata sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. FILE SPECIFICATIONS: Assigns values to local macros that specify the locations of the files used to build a Stata system file. Users must replace the placeholder file names with host computer-specific file specifications. For example, users on Windows platforms should replace "raw-datafile-name" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located in the root directory "C:\". Similarly, the "dictionary-filename" should be replaced with "C:\06512-0001-Stata_dictionary.dct". The "stata-datafile" specification should be replaced with the location where you wish to store the Stata system file.
  2. INFILE COMMAND: Reads the columnar ASCII data into a Stata system file.
  3. VALUE LABEL DEFINITIONS: Defines descriptive labels for the individual values of each variable.
  4. MISSING VALUES: Replaces numeric missing values (e.g., -9) with the generic system missing value ".". By default the code in this section is commented out. Users wishing to apply the generic missing values should remove the comment markers at the beginning and end of this section. Note that Stata allows you to specify up to 27 unique missing value codes.
  5. SAVE OUTFILE: Saves a Stata system format file. There is no reason to modify this section if the macros in Section 1 were specified correctly.
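
The note about 27 missing value codes in section 4 refers to Stata's generic "." plus the extended codes ".a" through ".z". The Python sketch below (not Stata code) contrasts the generic recode described above with one possible way of keeping user-defined codes distinct. The codes -9 and -8 and their assignment to ".a" and ".b" are assumptions made for illustration, not something the setup files are stated to do.

```python
import string

# Stata's numeric missing values: the generic "." plus ".a" through ".z".
STATA_MISSING = ["."] + ["." + letter for letter in string.ascii_lowercase]

# Hypothetical user-defined missing codes in a data file.
user_codes = [-9, -8]

# Assumed mapping that keeps the codes distinct: -9 -> ".a", -8 -> ".b".
code_map = dict(zip(user_codes, STATA_MISSING[1:]))


def generic_recode(value):
    """Collapse every user-defined missing code to the generic '.'."""
    return "." if value in user_codes else value


def extended_recode(value):
    """Map each user-defined missing code to its own extended code."""
    return code_map.get(value, value)


column = [3, -9, 7, -8]
generic = [generic_recode(v) for v in column]
extended = [extended_recode(v) for v in column]
```

The generic recode is simpler but discards the reason a value is missing. The extended mapping preserves it, at the cost of having to document which code means what.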