This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.

Census of Population and Housing, 2000 [United States]: Block Group Subset From Summary File 3 (ICPSR 13576)

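Principal Investigator(s): United States Department of Commerce. Bureau of the Census; Inter-university Consortium for Political and Social Research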


Prepared by the Inter-university Consortium for Political and Social Research, the block group subset was extracted from the Census of Population and Housing, 2000, Summary File 3 (SF3). The SF3 data contain information compiled from the questions asked of a sample of persons and housing units enumerated in Census 2000. Population items include sex, age, race, Hispanic or Latino origin, household relationship, marital status, caregiving by grandparents, language and ability to speak English, ancestry, place of birth, citizenship status and year of entry to the United States, migration, place of work, journey to work, school enrollment, educational attainment, veteran status, disability, employment status, industry, occupation, class of worker, income, and poverty status. Housing items include housing unit vacancy status, housing unit tenure (owner/renter), number of rooms, number of bedrooms, year moved into unit, occupants per room, units in structure, year structure built, heating fuel, telephone service, plumbing and kitchen facilities, vehicles available, value of home, rent, and shelter costs.

The information in SF3 is presented in 813 tables, one variable per table cell, plus additional variables with geographic information. However, only 409 of these tables are shown for the block group and higher levels of geography. The remaining 404 tables, which are shown for the census tract and higher levels of geography, were excluded from the block group subset.

Cases in the summary file data are classified by levels of observation, known as "summary levels" in the Census Bureau's nomenclature. The block group subset comprises all of the cases in the SF3 data for summary level 150. Five data files are provided with this collection. There is a block group subset for each of the four census regions (Northeast, Midwest, South, and West), plus a national subset that covers all of the regions.

Series: Census of Population and Housing, 2000 [United States] Series

Access Notes

  • These data are freely available.


DS0:  Study-Level Files
DS1:  All Regions (3,065.1 MB)
DS2:  Northeast Only (627.5 MB)
DS3:  Midwest Only (771.1 MB)
DS4:  South Only (1,044 MB)
DS5:  West Only (650.2 MB)

Study Description


United States Department of Commerce. Bureau of the Census, and Inter-university Consortium for Political and Social Research. Census of Population and Housing, 2000 [United States]: Block Group Subset From Summary File 3. ICPSR13576-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2004. http://doi.org/10.3886/ICPSR13576.v1


This study was funded by:

  • United States Department of Health and Human Services. National Institutes of Health (NIH: R01 HD42564)
  • National Science Foundation (NSF: SES 0137019)

Scope of Study

Subject Terms:   census data, demographic characteristics, ethnicity, household composition, housing, housing conditions, population

Geographic Coverage:   United States

Time Period:  

  • 2000

Date of Collection:  

  • 2000

Universe:   All persons and housing units in the United States.

Data Types:   census data

Data Collection Notes:

(1) The original SF3 data comprise 4,081 files. For the nation as a whole, every state, the District of Columbia, and Puerto Rico, there is one column-delimited file that contains geographic identifiers (the geographic header record file or "Geo" file), plus 76 comma-delimited table files, each of which contains a portion of the SF3 tables. For states, the District of Columbia, and Puerto Rico, the variables in the Geo file and table files 1-18 and 56-62 are shown down to the block group level, but the variables in table files 19-55 and 63-76 are only shown down to the census tract level. Consequently, table files 19-55 and 63-76 have fewer records than the Geo file and table files 1-18 and 56-62. In comparison, each of the 77 national files has the same number of cases.

(2) The block group subset was produced one state at a time from the 4,004 state files (including the District of Columbia and Puerto Rico). Initial steps in the production of a subset for a state involved sorting its Geo file and table files 1-18 and 56-62 in ascending order of the common identification variable LOGRECNO, reformatting the Geo file as a comma-delimited file, and stripping off the first five identification variables from table files 1-18 and 56-62. Next, the reformatted Geo file was merged with the stripped table files so that corresponding records in the Geo and table files were joined as a single record in the merged file. A state subset was generated by extracting from the merged file all cases coded 150 for SUMLEV, the variable that identifies the summary level. After subsets were produced for every state, the national and regional subsets were generated by combining their component state subsets in ascending order of their state Federal Information Processing Standards (FIPS) codes.

(3) The following states are included in the regional files.

  • Northeast: Connecticut, Massachusetts, Maine, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont.
  • Midwest: Iowa, Illinois, Indiana, Kansas, Michigan, Minnesota, Missouri, North Dakota, Nebraska, Ohio, South Dakota, and Wisconsin.
  • South: Alabama, Arkansas, District of Columbia, Delaware, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia, and Puerto Rico.
  • West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, New Mexico, Nevada, Oregon, Utah, Washington, and Wyoming.

(4) The implied decimal places in variables INTPTLAT (latitude) and INTPTLON (longitude) in the original data files were made explicit in the subset files. Additionally, the values of all the Geo variables were enclosed in quotes in the subsets, except for variables AREALAND, AREAWATR, POP100, HU100, INTPTLAT, and INTPTLON.

(5) The data definition statements were tested with SAS 8, SPSS 10, and Stata/SE 8.

(6) The codebook documents data collection procedures, concepts, and individual variables in the original Summary File data as well as the ICPSR-produced subset files, but not the layout and structure of the subsets. That information is contained in the data dictionary file provided with this collection.

(7) The codebook is provided by the principal investigators as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
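
The state-level procedure in note (2) (sort by LOGRECNO, strip the identification fields from the table files, merge with the Geo file, keep SUMLEV 150) can be illustrated with a minimal Python sketch. This is not the program ICPSR used. It assumes the Geo file has already been converted to comma-delimited form, that LOGRECNO is the last of the five identification fields in each table file, and that every table file has exactly one record per Geo record. The function names, the four-column Geo layout, and the toy records are hypothetical. The sketch joins records by LOGRECNO lookup rather than by position after sorting, which gives the same result under those assumptions.

```python
import csv
import io

ID_FIELDS = 5  # leading identification fields in each table file, per note (2)


def read_rows(text):
    """Parse comma-delimited text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))


def block_group_subset(geo_header, geo_rows, table_files, sumlev="150"):
    """Merge Geo and table records on LOGRECNO and keep one summary level."""
    logrecno_col = geo_header.index("LOGRECNO")
    sumlev_col = geo_header.index("SUMLEV")

    # Sort the Geo records in ascending order of LOGRECNO.
    geo_sorted = sorted(geo_rows, key=lambda row: int(row[logrecno_col]))

    # Strip the identification fields from each table file, indexing the
    # remaining data cells by LOGRECNO (assumed to be the fifth field).
    stripped = [
        {int(row[ID_FIELDS - 1]): row[ID_FIELDS:] for row in rows}
        for rows in table_files
    ]

    # Join each Geo record with its matching record from every table file.
    merged = []
    for geo in geo_sorted:
        record = list(geo)
        for table in stripped:
            record.extend(table[int(geo[logrecno_col])])
        merged.append(record)

    # Extract the cases coded with the requested summary level.
    return [row for row in merged if row[sumlev_col] == sumlev]


# Hypothetical toy inputs: a reduced Geo layout and two small table files.
GEO_HEADER = ["FILEID", "STUSAB", "SUMLEV", "LOGRECNO"]
geo = read_rows(
    "uSF3,XX,140,0000002\n"
    "uSF3,XX,150,0000003\n"
    "uSF3,XX,040,0000001\n"
    "uSF3,XX,150,0000004\n"
)
table_a = read_rows(
    "uSF3,XX,000,01,0000003,10,20\n"
    "uSF3,XX,000,01,0000001,99,98\n"
    "uSF3,XX,000,01,0000004,30,40\n"
    "uSF3,XX,000,01,0000002,55,56\n"
)
table_b = read_rows(
    "uSF3,XX,000,02,0000004,8\n"
    "uSF3,XX,000,02,0000002,6\n"
    "uSF3,XX,000,02,0000001,5\n"
    "uSF3,XX,000,02,0000003,7\n"
)

subset = block_group_subset(GEO_HEADER, geo, [table_a, table_b])
for row in subset:
    print(",".join(row))
```

With the toy inputs, only the two records coded 150 survive, each carrying its Geo fields followed by the data cells from both table files. National and regional files would then be built by concatenating such state subsets in ascending order of state FIPS code.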


Sample:   Every person and housing unit in the United States was asked basic demographic and housing questions, for example, race, age, relationship to householder, housing unit vacancy status, and housing unit tenure. A sample of these people were asked more detailed questions about items such as income, occupation, and housing costs. The sampling unit for Census 2000 was the housing unit, including all occupants. There were four different housing unit sampling rates: 1-in-8, 1-in-6, 1-in-4, and 1-in-2 (designed for an overall average of about 1-in-6). The Census Bureau assigned these varying rates based on pre-census occupied housing unit estimates of various geographic and statistical entities, such as incorporated places and interim census tracts. For people living in group quarters or enumerated at long-form eligible service sites (shelters and soup kitchens), the sampling unit was the person and the sampling rate was 1-in-6.

Data Source:

self-enumerated questionnaires


Original ICPSR Release:  

Version History:

  • 2006-01-18 File ST13576.ALL was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
  • 2006-01-18 File SP13576.ALL was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
  • 2006-01-18 File SA13576.ALL was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
  • 2006-01-18 File DD13576.ALL was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
  • 2006-01-18 File CB13576.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
