This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.
Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, Advance National (ICPSR 13285)
Principal Investigator(s): United States Department of Commerce. Bureau of the Census; Inter-university Consortium for Political and Social Research
Prepared by the Inter-university Consortium for Political and Social Research, this data collection consists of selected subsets extracted from the Census of Population and Housing, 2000 [United States]: Summary File 1, Advance National (ICPSR 3325). Summary File 1 data contain information compiled from the questions asked of all people and of every housing unit enumerated in Census 2000: questions covering sex, age, race, Hispanic or Latino origin, type of living quarters (household/group quarters), household relationship, housing unit vacancy status, and housing unit tenure (owner/renter). The information is presented in 286 tables, which are tabulated for every case, i.e., every geographic unit represented in the data. There is one variable per table cell, plus additional variables with geographic information. All cases in the summary file data are classified by levels of observation, known as "summary levels" in the Census Bureau's nomenclature. These levels of observation served as the selection criteria for the subsets. Each subset comprises all of the cases in one of five summary levels: the nation (summary level 010), states (summary level 040), counties (summary level 050), places (summary level 160), and five-digit ZIP code tabulation areas (summary level 860). Three files are supplied for each subset except the last: a single, relatively large file that contains all of the tables in the data, plus two smaller files, each of which contains approximately one half of the tables. For the five-digit ZIP code tabulation areas, there is only one file, which contains all of the tables.
These data are freely available.
WARNING: Because this study has many datasets, the download all files option has been suppressed, and you will need to download one dataset at a time.
WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.
United States Department of Commerce, Bureau of the Census, and Inter-university Consortium for Political and Social Research. CENSUS OF POPULATION AND HOUSING, 2000 [UNITED STATES]: SELECTED SUBSETS FROM SUMMARY FILE 1. ICPSR ed. Washington, DC: U.S. Dept. of Commerce, Bureau of the Census, and Ann Arbor, MI: Inter-university Consortium for Political and Social Research [producers], 2002. Ann Arbor, MI: Inter-university Consortium for Political and Social Research, [distributor], 2002. http://doi.org/10.3886/ICPSR13285.v1
Persistent URL: http://doi.org/10.3886/ICPSR13285.v1
This study was funded by:
- National Science Foundation (SES 0137019)
Scope of Study
Subject Terms: census data, ethnicity, household composition, housing, housing conditions, population
Geographic Coverage: United States
Date of Collection:
Universe: All persons and housing units in the United States.
Data Types: census/enumeration data
Data Collection Notes:
(1) The original Summary File 1, Advance National data comprise 40 files. There is one column-delimited file that contains geographic identifiers (the geographic header record file or "Geo" file), plus 39 comma-delimited table files, each with a subset of tables in the data. Initial steps in the production of the subsets for this collection involved sorting the Geo file and the 39 table files in ascending order of the common identification variable LOGRECNO, reformatting the Geo file as a comma-delimited file, and stripping the first five identification variables from each of the 39 table files (FILEID, STUSAB, CHARITER, CIFSN, and LOGRECNO). Next, the reformatted Geo file was merged with the stripped table files, end to end, so that corresponding records in the Geo and table files were joined as a single record in the merged file. Finally, each subset was generated by extracting from the merged file all cases with a given value for SUMLEV, the variable that identifies the summary level. Separate subsets were generated for summary levels 010, 040, 050, 160, and 860.
(2) To allow for compatibility with SPSS (as of August 2002), subsets with a record length greater than the SPSS limit of 32,767 were "split" into two files, each with a record length less than the limit. Three files are supplied for each of these subsets: a "first half" file containing the Geo variables and tables P1-PCT12E, a "second half" file containing the Geo variables and tables PCT12F-H16I, and a complete file (for non-SPSS use) that contains the Geo variables and all of the tables, P1-H16I.
(3) Each subset contains all of the geographic component iterations in its summary level, if any.
(4) The implied decimal places of variables INTPTLAT (latitude) and INTPTLON (longitude) were made explicit in the subsets. In addition, the values of all Geo variables were enclosed in quotes, except for variables AREALAND, AREAWATR, POP100, HU100, INTPTLAT, and INTPTLON.
(5) The data definition statements were tested with SAS 8, SPSS 10, and Stata/SE 7.0.
(6) The codebook is provided by the principal investigator as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
(7) The codebook documents data collection procedures, concepts, and individual variables in the original Summary File data as well as the ICPSR-produced subsets, but not the layout and structure of the subsets. That information is contained in the data dictionary files provided with this collection. In particular, the "Data Structure and Segmentation" section in chapter 2 of the codebook and the variable locations shown in chapter 7 do not apply to the subsets. Every subset file record begins with the Geo variables in their original order. In a complete subset file, the Geo variables are followed by the 6th to last variables in table file 1, then the 6th to last variables in table file 2, and so on up to the 6th to last variables in table file 39. Each "first half" file is a subset of a complete file: it begins with the first variable in the Geo file and ends with the last variable in table file 20. In a "second half" file, the Geo variables are followed by the 6th to last variables in table file 21, then the 6th to last variables in table file 22, and so on up to the 6th to last variables in table file 39.
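The subset construction described in the notes above (sort by LOGRECNO, strip the five identification variables from each table file, join end to end, extract by SUMLEV) can be illustrated with a minimal Python sketch. It is not the procedure ICPSR actually ran. The three-field Geo layout, the sample values, and the function names are hypothetical, the Geo file is assumed to be already reformatted as comma-delimited text, and the six implied decimal places used for INTPTLAT and INTPTLON are an assumption that should be checked against the codebook.

```python
import csv
import io

N_ID_FIELDS = 5  # FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO lead every table record


def read_rows(text):
    """Parse comma-delimited text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))


def merge_records(geo_rows, table_files, logrecno_geo_idx, logrecno_table_idx=4):
    """Join each Geo record with the matching record of every table file.

    Inputs are sorted by LOGRECNO (zero-padded, so string order works),
    the five identification fields are dropped from each table record,
    and the remaining fields are appended to the Geo record in file order.
    """
    geo_sorted = sorted(geo_rows, key=lambda r: r[logrecno_geo_idx])
    tables_sorted = [
        sorted(rows, key=lambda r: r[logrecno_table_idx]) for rows in table_files
    ]
    for rows in tables_sorted:
        if len(rows) != len(geo_sorted):
            raise ValueError("Geo and table files differ in record count")
    merged = []
    for i, geo in enumerate(geo_sorted):
        record = list(geo)
        for rows in tables_sorted:
            if rows[i][logrecno_table_idx] != geo[logrecno_geo_idx]:
                raise ValueError("LOGRECNO mismatch between Geo and table file")
            record.extend(rows[i][N_ID_FIELDS:])
        merged.append(record)
    return merged


def extract_summary_level(merged_rows, sumlev_idx, level):
    """Keep only the cases whose SUMLEV equals the requested summary level."""
    return [row for row in merged_rows if row[sumlev_idx] == level]


def explicit_decimal(value, places=6):
    """Insert a decimal point into a signed digit string (places is assumed)."""
    sign = "-" if value.startswith("-") else ""
    digits = value.lstrip("+-").zfill(places + 1)
    return f"{sign}{digits[:-places]}.{digits[-places:]}"


# Hypothetical, simplified Geo layout: SUMLEV, LOGRECNO, NAME.
geo = read_rows(
    "050,0000003,Example County\n"
    "040,0000002,Example State\n"
    "010,0000001,United States\n"
)
# Two toy table files standing in for the 39 real ones.
table1 = read_rows(
    "uSF1,US,000,01,0000001,100\n"
    "uSF1,US,000,01,0000003,30\n"
    "uSF1,US,000,01,0000002,60\n"
)
table2 = read_rows(
    "uSF1,US,000,02,0000002,6,7\n"
    "uSF1,US,000,02,0000001,10,11\n"
    "uSF1,US,000,02,0000003,3,4\n"
)

merged = merge_records(geo, [table1, table2], logrecno_geo_idx=1)
states = extract_summary_level(merged, sumlev_idx=0, level="040")

# A "first half" style file keeps the Geo variables plus only the leading
# table files (files 1-20 in the real data; just table1 in this toy case).
first_half = merge_records(geo, [table1], logrecno_geo_idx=1)
```

In this sketch a "second half" file would be built the same way from the remaining table files, so both halves repeat the Geo variables, as the notes describe.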
Original ICPSR Release: 2002-10-02
- 2006-01-18 File CB13285.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.