National Neighborhood Data Archive (NaNDA): Socioeconomic Status and Demographic Characteristics of Census Tracts and ZIP Code Tabulation Areas, United States, 1990-2022 (ICPSR 38528)

Version Date: Jan 22, 2025

Principal Investigator(s):
Philippa Clarke, University of Michigan. Institute for Social Research; Robert Melendez, University of Michigan. Institute for Social Research; Grace Noppert, University of Michigan. Institute for Social Research; Megan Chenoweth, University of Michigan. Institute for Social Research; Lindsay Gypin, University of Michigan. Institute for Social Research

https://doi.org/10.3886/ICPSR38528.v5

Version V5

  • V5 [2025-01-22]
  • V4 [2024-10-02] unpublished
  • V3 [2023-04-17] unpublished
  • V2 [2022-09-27] unpublished
  • V1 [2022-09-15] unpublished

These datasets contain measures of socioeconomic and demographic characteristics by U.S. census tract for the years 1990-2022 and by ZIP code tabulation area (ZCTA) for the years 2008-2022. Example measures include population density; population distribution by race, ethnicity, age, and income; income inequality by race and ethnicity; the proportion of the population living below the poverty level or receiving public assistance; and the proportion of female-headed or single-parent families with children. The datasets also contain a set of theoretically derived measures capturing neighborhood socioeconomic disadvantage and affluence, as well as a neighborhood index of Hispanic, foreign-born, and limited-English-proficiency populations.

Clarke, Philippa, Melendez, Robert, Noppert, Grace, Chenoweth, Megan, and Gypin, Lindsay. National Neighborhood Data Archive (NaNDA): Socioeconomic Status and Demographic Characteristics of Census Tracts and ZIP Code Tabulation Areas, United States, 1990-2022. Inter-university Consortium for Political and Social Research [distributor], 2025-01-22. https://doi.org/10.3886/ICPSR38528.v5

United States Department of Health and Human Services. Administration for Community Living. National Institute on Disability, Independent Living, and Rehabilitation Research (90RTHF0001), United States Department of Health and Human Services. National Institutes of Health. National Institute on Aging (RF1-AG-057540), United States Department of Health and Human Services. National Institutes of Health. National Institute of Nursing Research (U01NR020556), United States Department of Health and Human Services. National Institutes of Health. National Center on Minority Health and Health Disparities (U01NR020556)

County Federal Information Processing System (FIPS)

Inter-university Consortium for Political and Social Research

1990 -- 2022
2018-01-01 -- 2020-12-31, 2022-01-01 -- 2022-12-31, 2024-01-01 -- 2024-12-31
  1. The data and documentation for the Neighborhood Socioeconomic and Demographic Characteristics of Census Tracts, United States, 2000-2010 Data were originally deposited in openICPSR 111107. The current version of this study includes data from 1990-2000 not included in the original deposit.

    The data and documentation for the Socioeconomic Status and Demographic Characteristics of Census Tracts, United States, 2008-2017 Data were originally deposited in openICPSR 119451.

    The data and documentation for the Socioeconomic Status and Demographic Characteristics of ZIP Code Tabulation Areas, United States, 2008-2017 Data were originally deposited in openICPSR 120462.

  2. A ZIP code to ZCTA crosswalk must be used to combine the ZCTA dataset with ZIP code geocoded data. A crosswalk and sample code for merging the crosswalk with National Neighborhood Data Archive (NaNDA) datasets are available in the ICPSR Linkage Library.

    Data users wanting to combine data from each time period should be aware of key differences between how measures are calculated across the datasets. For more information, see the "Usage Note" section of each dataset's corresponding documentation file.

  3. For additional information, see the National Neighborhood Data Archive (NaNDA).
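The ZIP-to-ZCTA linkage described in note 2 can be sketched as follows. This is a minimal illustration and not the sample code from the ICPSR Linkage Library; the field names (`zip`, `zcta`, `disadvantage`) and all values are hypothetical placeholders, and the real crosswalk and NaNDA files define their own variable names.

```python
# Minimal sketch: attach ZCTA-level measures to ZIP-geocoded records.
# All field names and values below are hypothetical.

def attach_zcta_measures(records, crosswalk, zcta_measures):
    """Look up each record's ZIP code in the crosswalk, then pull ZCTA measures."""
    linked = []
    for rec in records:
        # ZIP codes are identifiers, not numbers: restore any lost leading zeros.
        zip_code = str(rec["zip"]).zfill(5)
        zcta = crosswalk.get(zip_code)          # None if the ZIP has no ZCTA match
        measures = zcta_measures.get(zcta, {})  # empty if nothing to attach
        linked.append({**rec, "zip": zip_code, "zcta": zcta, **measures})
    return linked

# Illustrative inputs only.
crosswalk = {"00501": "11742", "48104": "48104"}
zcta_measures = {
    "11742": {"disadvantage": 0.08},
    "48104": {"disadvantage": 0.11},
}
records = [
    {"id": 1, "zip": 501},       # stored as an integer, leading zeros lost
    {"id": 2, "zip": "48104"},
    {"id": 3, "zip": "99999"},   # not present in the crosswalk
]

linked = attach_zcta_measures(records, crosswalk, zcta_measures)
for row in linked:
    print(row)
```

Unmatched ZIP codes are kept with `zcta` set to `None` rather than dropped, so the number of records that failed to link can be checked before analysis.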

These datasets were created to measure socioeconomic and demographic characteristics by U.S. census tract for the years 1990-2022 and by ZIP code tabulation area (ZCTA) for the years 2008-2022.

For each period covered in the data, the research team extracted key census indicators related to race, ethnicity, age, income level, employment, poverty, and home ownership from the American Community Survey (ACS). Next, the team merged these variables with the land area of each census tract or ZIP code tabulation area (ZCTA) from the TIGER/Line shapefiles. The team then used those variables to construct three indices: neighborhood disadvantage, neighborhood affluence, and ethnic and immigrant concentration. Specific years for the ACS estimates and TIGER/Line shapefiles used are listed below:

  • 2008-2017 data: ACS 2012 five-year estimate (covering 2008-2012), merging these variables with ACS 2017 five-year estimate (covering 2013-2017) and with 2010 TIGER/Line shapefiles for census tracts/ZCTAs

  • 2016-2020 data: ACS 2020 five-year estimate (covering 2016-2020), merging these variables with 2020 TIGER/Line shapefiles for census tracts/ZCTAs

  • 2018-2022 data: ACS 2022 five-year estimate (covering 2018-2022), merging these variables with 2010 and 2020 TIGER/Line shapefiles for census tracts/ZCTAs
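Merging land area onto the ACS counts is what makes a measure such as population density computable. The sketch below shows the arithmetic only. It assumes land area arrives in square meters (the unit of the TIGER/Line `ALAND` attribute) and that density is wanted in persons per square mile; the function name, the output unit, and the example values are assumptions, not the study's documented procedure.

```python
# Minimal sketch: population density from an ACS count and a TIGER/Line land area.
# Assumes land area in square meters; output unit (per square mile) is an assumption.

SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # one statute mile is 1609.344 meters

def population_density(population, land_area_sq_m):
    """Return persons per square mile, or None if land area is missing or zero."""
    if land_area_sq_m is None or land_area_sq_m <= 0:
        return None  # e.g. a water-only tract; avoid dividing by zero
    return population / (land_area_sq_m / SQ_METERS_PER_SQ_MILE)

# Hypothetical tract: 4,200 residents on exactly two square miles of land.
density = population_density(4200, 2 * SQ_METERS_PER_SQ_MILE)
print(density)
```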

Index creation was informed by work to construct a set of variables that would characterize the sociodemographic structure of census tracts over time. In 1990, the research team conducted a principal factor analysis with an orthogonal varimax rotation of 10 census indicators (log transformed to correct positive skew). This factor analysis was repeated in 2000 (Morenoff et al., 2007). The aim was to derive a parsimonious set of factors that capture the shared variance of a broad spectrum of neighborhood structural characteristics. Results from the factor analysis indicated three separate factors:

  • The first factor, which the research team interprets as neighborhood disadvantage, is characterized by high levels of poverty, unemployment, female-headed families, households receiving public assistance income, and a high proportion of African Americans in a census tract.

  • The second factor represents a mix of characteristics associated with neighborhood affluence (concentrations of adults with a college education; with incomes >$75K; and employed in managerial and professional occupations).

  • The third factor represents ethnic and immigrant concentration. Higher values indicate a larger share of Hispanic and foreign-born residents in the census tract.
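The general shape of the procedure described above (log transform, factor extraction, orthogonal varimax rotation) can be sketched with NumPy. This is an illustration under stated assumptions, not the research team's code: the data are synthetic, the 10 indicators are unnamed placeholders, and loadings are extracted by principal components of the correlation matrix, which is a simpler stand-in for principal factor analysis.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix by the varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        grad = loadings.T @ (
            rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_vars
        )
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation

# Synthetic, positively skewed stand-ins for 10 tract-level indicators.
rng = np.random.default_rng(0)
indicators = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 10))

logged = np.log(indicators)                  # log transform to reduce positive skew
corr = np.corrcoef(logged, rowvar=False)     # 10 x 10 correlation matrix

# Principal components extraction (an assumption; not principal factoring).
eigenvalues, eigenvectors = np.linalg.eigh(corr)   # eigenvalues in ascending order
top = slice(-3, None)                               # keep the three largest
loadings = eigenvectors[:, top] * np.sqrt(eigenvalues[top])

rotated_loadings = varimax(loadings)
print(rotated_loadings.round(2))
```

Because the rotation is orthogonal, each indicator's communality (its row sum of squared loadings) is unchanged; only the distribution of loadings across the three factors shifts, which is what makes the factors easier to interpret.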

For the 2016-2020 analysis, the team conducted a principal components analysis, exploratory factor analysis, and confirmatory factor analysis with census tract indicators from the 2016-2020 ACS 5-year estimates to empirically reevaluate the neighborhood socioeconomic and demographic indices from previous versions of the data. Results from the 2016-2020 factor analysis indicated these factors, which were also used for the 2018-2022 data:

  • Neighborhood disadvantage factor, characterized by high levels of poverty, low family income, and households receiving public assistance income in the neighborhood.

  • Neighborhood affluence factor, characterized by high levels of people with a college education, families with high income, and people employed in professional/managerial occupations in the neighborhood.

  • Ethnic and immigrant concentration, representing a higher proportion of Hispanic, foreign born, and people with limited English proficiency in the neighborhood.

Longitudinal

Census tracts and ZIP code tabulation areas in the United States, including Puerto Rico.

Geographic Unit

Census tract- and ZIP code tabulation area (ZCTA)-level measures, 2008-2017: Population values and percentage variables per census tract are from the American Community Survey five-year estimates for 2008-2012 and 2013-2017. Both are based on census tract boundaries as of 2010. Land area for each census tract comes from the TIGER/Line shapefiles, 2010 version. ZCTA boundaries are based on the 2019 TIGER/Line shapefiles.

Census tract-level measures, 1990-2000: Data for 1990 are from the 1990 Census of Population and Housing, which is an extraction of the 1990 Decennial Census. Annual measures for the intervening years (1991-1999 and 2001-2009) are linearly interpolated between the 1990 and 2000 data sources and between the 2000 and 2010 data sources. Census tract boundaries were normalized to the 2010 tract boundaries using the Longitudinal Tract Data Base described in Logan, Xu, and Stults 2014 (http://dx.doi.org/10.1080/00330124.2014.905156).
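The linear interpolation of intervening years can be sketched as below. The function name and the example values are hypothetical; the sketch shows only the arithmetic of filling annual values on a straight line between two endpoint years.

```python
# Minimal sketch: fill annual values between two endpoint years by linear interpolation.
# Function name and example values are hypothetical.

def interpolate_annual(start_year, start_value, end_year, end_value):
    """Return {year: value} for every year from start_year through end_year."""
    span = end_year - start_year
    return {
        year: start_value + (end_value - start_value) * (year - start_year) / span
        for year in range(start_year, end_year + 1)
    }

# Hypothetical tract whose poverty proportion was 0.10 in 1990 and 0.20 in 2000.
series = interpolate_annual(1990, 0.10, 2000, 0.20)
for year, value in series.items():
    print(year, round(value, 3))
```

Each intervening year moves one tenth of the way from the 1990 value to the 2000 value, so the 1995 estimate in this example is the midpoint of the two endpoints.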

The data and documentation for the Neighborhood Socioeconomic and Demographic Characteristics of Census Tracts, United States, 2000-2010 Data were originally deposited in openICPSR 111107. Version 4 of this study includes data from 1990-2000 not included in the original deposit.

Census tract-level measures, 2000-2010: Population values and percentage variables per census tract are from the 2000 Census of Population and Housing Summary and the American Community Survey five-year estimate for 2008-2012. Census tract boundaries were normalized to the 2010 tract boundaries using the Longitudinal Tract Data Base described in Logan, Xu, and Stults 2014 (http://dx.doi.org/10.1080/00330124.2014.905156).

Census tract- and ZCTA-level measures, 2018-2022: Population values and proportion variables are from the American Community Survey five-year estimates for 2018-2022. Land area for each census tract/ZCTA comes from the 2010 and 2020 TIGER/Line shapefiles.

Census tract- and ZCTA-level measures, 2016-2020: Population values and proportion variables are from the American Community Survey five-year estimates for 2016-2020. Land area for each census tract/ZCTA comes from the TIGER/Line shapefiles, 2020 version.


2022-09-15

2025-01-22 Data and documentation for 2018-2022 were added (DS6-DS8). The study title and P.I. list were updated. Data fileset names were updated to specify census year geography. ICPSR and PI documentation files were updated to reflect the revised title, P.I. list, and fileset names.

2024-10-02 Data and documentation for DS1 (2000-2010) were replaced with a new P.I. deposit covering 1990-2010. ICPSR codebooks for DS2, DS3, DS4, and DS5 were updated to the latest format, and epub codebooks were added.

2023-04-17 Data and documentation for 2016-2020 were added (DS4 and DS5). The study title and P.I. list were updated. Covers for all documentation were updated to reflect the revised title and P.I. list.

2022-09-27 ICPSR codebook and documentation covers were updated to correct spelling for a P.I. name.

2022-09-15 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Checked for undocumented or out-of-range codes.


Notes

  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.