Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, States (ICPSR 13395)

Published: Jan 18, 2006

Principal Investigator(s):
United States Department of Commerce. Bureau of the Census; Inter-university Consortium for Political and Social Research



Version V1

Prepared by the Inter-university Consortium for Political and Social Research, this data collection consists of selected subsets extracted from the CENSUS OF POPULATION AND HOUSING, 2000 [UNITED STATES]: SUMMARY FILE 1, STATES (ICPSR 3194). Summary File 1 data contain information compiled from the questions asked of all people and of every housing unit enumerated in Census 2000: sex, age, race, Hispanic or Latino origin, type of living quarters (household/group quarters), household relationship, housing unit vacancy status, and housing unit tenure (owner/renter). The information is presented in 286 tables, one variable per table cell, plus additional variables with geographic information. Cases in the summary file data are classified by levels of observation, known as "summary levels" in the Census Bureau's nomenclature, which served as the selection criteria for the subsets. Each subset comprises all of the cases in one of two summary levels: whole census tracts (summary level 140) and census tracts in places (summary level 158). The latter covers whole tracts completely within places and portions of tracts that cross place boundaries. Five files are provided for each subset: one for each of the four census regions (Northeast, Midwest, South, and West) and a combined national file. Puerto Rico is included in the national and South files.

United States Department of Commerce. Bureau of the Census, and Inter-university Consortium for Political and Social Research. Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, States. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2006-01-18. https://doi.org/10.3886/ICPSR13395.v1


National Science Foundation (SES 0137019)



(1) The original Summary File 1, States data comprise 2,080 files (52 state-level entities, each with 40 files). For each state (District of Columbia and Puerto Rico included), there is one column-delimited file that contains geographic identifiers (the geographic header record file or "Geo" file), plus 39 comma-delimited table files, each with a subset of tables in the data. For reasons of confidentiality, table files 12-36 do not contain data below the tract level. Consequently, they have fewer records than the Geo file and table files 1-11 and 37-39. The subsets were produced piece by piece, one state at a time. Initial steps in the production of a subset for a state involved sorting its Geo file and 39 table files in ascending order of the common identification variable LOGRECNO, reformatting the Geo file as a comma-delimited file, removing records with data below the tract level from table files 1-11 and 37-39, and stripping off the first five identification variables from each of the 39 table files (FILEID, STUSAB, CHARITER, CIFSN, and LOGRECNO). Next, the reformatted Geo file was merged with the stripped table files so that corresponding records in the Geo and table files were joined as a single record in the merged file. The state subset was generated by extracting from the merged file all cases with a given value for SUMLEV, the variable that identifies the summary level. Separate subsets were generated for summary levels 140 and 158. After subsets were produced for every state, the national and regional files were compiled by concatenating component state subsets in ascending order of their state Federal Information Processing Standards (FIPS) codes.

(2) The following states are included in the four regional files.

  • Northeast: Connecticut, Massachusetts, Maine, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont.
  • Midwest: Iowa, Illinois, Indiana, Kansas, Michigan, Minnesota, Missouri, North Dakota, Nebraska, Ohio, South Dakota, and Wisconsin.
  • South: Alabama, Arkansas, District of Columbia, Delaware, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia, and Puerto Rico.
  • West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, New Mexico, Nevada, Oregon, Utah, Washington, and Wyoming.

(3) The implied decimal places in variables INTPTLAT (latitude) and INTPTLON (longitude) were made explicit in the subsets. In addition, the values of all Geo variables were enclosed in quotes, except for variables AREALAND, AREAWATR, POP100, HU100, INTPTLAT, and INTPTLON.

(4) The data definition statements were tested with SAS 8, SPSS 10, and Stata/SE 7.0.

(5) The codebook documents data collection procedures, concepts, and individual variables in the original Summary File data as well as the ICPSR-produced subsets, but not the layout and structure of the subsets. That information is contained in the data dictionary file provided with this collection. In particular, the "Data Structure and Segmentation" section in chapter 2 of the codebook and the variable locations shown in chapter 7 do not apply to the subsets. Every subset file record begins with the Geo variables in their original order. The Geo variables are followed by the 6th through last variables in table file 1, then the 6th through last variables in table file 2, and so on up to the 6th through last variables in table file 39.

(6) The codebook is provided by the principal investigator as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
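The subset production steps described in note (1) can be illustrated with a minimal Python sketch. It is not the procedure ICPSR actually ran. The function names, the five-variable Geo layout, and all data values are made up for illustration (the real Geo file has many more variables), and the sketch joins records through a dictionary lookup on LOGRECNO and filters on SUMLEV before joining, rather than merging sorted files and extracting afterward.

```python
import csv
import io

# Every table file starts with FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO.
ID_VARS = 5


def read_rows(text):
    """Parse comma-delimited text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))


def build_state_subset(geo_rows, table_files, sumlev, geo_header):
    """Join one state's Geo rows to its table files for one summary level.

    geo_rows    : Geo records already reformatted as comma-delimited rows
    table_files : list of table files, each a list of rows
    sumlev      : summary level to keep, e.g. "140" or "158"
    geo_header  : Geo variable names (hypothetical, simplified layout)
    """
    sumlev_col = geo_header.index("SUMLEV")
    logrecno_col = geo_header.index("LOGRECNO")

    # Index each table file by LOGRECNO and strip the five ID variables,
    # keeping the 6th through last variables of every record.
    indexed = [
        {row[ID_VARS - 1]: row[ID_VARS:] for row in rows}
        for rows in table_files
    ]

    subset = []
    # LOGRECNO is zero-padded here, so a string sort gives ascending order.
    for geo in sorted(geo_rows, key=lambda r: r[logrecno_col]):
        if geo[sumlev_col] != sumlev:
            continue  # keep only the requested summary level
        merged = list(geo)  # Geo variables come first, in original order
        for table in indexed:
            merged.extend(table[geo[logrecno_col]])
        subset.append(merged)
    return subset


def concatenate(state_subsets):
    """Stack state subsets in ascending order of state FIPS code."""
    combined = []
    for fips in sorted(state_subsets):
        combined.extend(state_subsets[fips])
    return combined


# Toy input with made-up values: one Geo file and two table files.
GEO_HEADER = ["FILEID", "STUSAB", "SUMLEV", "LOGRECNO", "POP100"]
geo = read_rows(
    "uSF1,RI,140,0000002,4100\n"
    "uSF1,RI,040,0000001,9999\n"
    "uSF1,RI,158,0000003,900\n"
)
table1 = read_rows(
    "uSF1,RI,000,01,0000001,9999\n"
    "uSF1,RI,000,01,0000002,4100\n"
    "uSF1,RI,000,01,0000003,900\n"
)
table2 = read_rows(
    "uSF1,RI,000,02,0000001,10,20\n"
    "uSF1,RI,000,02,0000002,1,2\n"
    "uSF1,RI,000,02,0000003,3,4\n"
)

subset_140 = build_state_subset(geo, [table1, table2], "140", GEO_HEADER)
print(subset_140)
```

In this toy run only the record with SUMLEV 140 survives, and its merged row holds the Geo variables followed by the stripped variables from each table file, which mirrors the record layout described in note (5).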

All persons and housing units in the United States.

self-enumerated questionnaires

census data



2006-01-18 File CB13395.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.


  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

  • The citation of this study may have changed due to the new version control system that has been implemented.

This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.