National Neighborhood Data Archive (NaNDA): Socioeconomic Status and Demographic Characteristics of Census Tracts and ZIP Code Tabulation Areas, United States, 1990-2020 (ICPSR 38528)

Version Date: Oct 2, 2024

Principal Investigator(s):
Philippa Clarke, University of Michigan. Institute for Social Research; Grace Noppert, University of Michigan. Institute for Social Research; Robert Melendez, University of Michigan. Institute for Social Research; Megan Chenoweth, University of Michigan. Institute for Social Research; Lindsay Gypin, University of Michigan. Institute for Social Research

https://doi.org/10.3886/ICPSR38528.v4

Version V4

  • V5 [2025-01-22]
  • V4 [2024-10-02] unpublished
  • V3 [2023-04-17] unpublished
  • V2 [2022-09-27] unpublished
  • V1 [2022-09-15] unpublished

You are currently viewing an older version of this data collection. A more recent version may be available.

Additional information about this collection can be found in Version History.

2024-10-02 Data and documentation for DS1 (2000-2010) were replaced with a new P.I. deposit covering 1990-2010. ICPSR codebooks for DS2, DS3, DS4, and DS5 were updated to the latest format, and epub codebooks were added.

2023-04-17 Data and documentation for 2016-2020 were added (DS4 and DS5). The study title and P.I. list were updated. Covers for all documentation were updated to reflect the revised title and P.I. list.

2022-09-27 ICPSR codebook and documentation covers were updated to correct spelling for a P.I. name.

2022-09-15 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Checked for undocumented or out-of-range codes.

These datasets contain measures of socioeconomic and demographic characteristics by US census tract for the years 1990-2020 and by ZIP code tabulation area (ZCTA) for the years 2008-2020. Example measures include population density; population distribution by race, ethnicity, age, and income; income inequality by race and ethnicity; the proportion of the population living below the poverty level or receiving public assistance; and the proportion of female-headed or single-parent families with children. The datasets also contain a set of theoretically derived measures capturing neighborhood socioeconomic disadvantage and affluence, as well as a neighborhood index of Hispanic, foreign-born, and limited-English-proficiency populations.

Clarke, Philippa, Noppert, Grace, Melendez, Robert, Chenoweth, Megan, and Gypin, Lindsay. National Neighborhood Data Archive (NaNDA): Socioeconomic Status and Demographic Characteristics of Census Tracts and ZIP Code Tabulation Areas, United States, 1990-2020. Inter-university Consortium for Political and Social Research [distributor], 2024-10-02. https://doi.org/10.3886/ICPSR38528.v4

United States Department of Health and Human Services. Administration for Community Living. National Institute on Disability, Independent Living, and Rehabilitation Research (90RTHF0001), United States Department of Health and Human Services. National Institutes of Health. National Institute on Aging (RF1-AG-057540), United States Department of Health and Human Services. National Institutes of Health. National Institute of Nursing Research (U01NR020556), United States Department of Health and Human Services. National Institutes of Health. National Center on Minority Health and Health Disparities (U01NR020556)

County Federal Information Processing System (FIPS)

Inter-university Consortium for Political and Social Research
1990 -- 2020
2018-01-01 -- 2020-12-31, 2022-01-01 -- 2022-12-31
  1. The data and documentation for the Neighborhood Socioeconomic and Demographic Characteristics of Census Tracts, United States, 2000-2010 Data were originally deposited in openICPSR 111107. Version 4 of this study includes data from 1990-2000 not included in the original deposit.

    The data and documentation for the Socioeconomic Status and Demographic Characteristics of Census Tracts, United States, 2008-2017 Data were originally deposited in openICPSR 119451.

    The data and documentation for the Socioeconomic Status and Demographic Characteristics of ZIP Code Tabulation Areas, United States, 2008-2017 Data were originally deposited in openICPSR 120462.

  2. A ZIP code to ZCTA crosswalk must be used to combine the ZCTA dataset with ZIP code geocoded data. Such a crosswalk is available on the UDS Mapper website at https://udsmapper.org/zip-code-to-zcta-crosswalk/. Sample code for merging the UDS Mapper crosswalk with NaNDA datasets is available at http://doi.org/10.3886/E124461.

    Users wanting to combine the 1990-2010, 2008-2017, and 2016-2020 data should be aware of key differences between how measures are calculated across the datasets. For more information, see the "Usage Note" section of each dataset's corresponding documentation file.

  3. For additional information see the National Neighborhood Data Archive (NaNDA).
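
The crosswalk step in note 2 can be sketched as follows. This is a minimal illustration in plain Python, not the sample code published at E124461: the column names (zip_code, zcta), the measure names, and every value below are made-up placeholders, and the real UDS Mapper and NaNDA files may use different headers.

```python
# Hypothetical crosswalk rows: each ZIP code maps to one ZCTA.
# Column names and values are illustrative assumptions only.
crosswalk = [
    {"zip_code": "48104", "zcta": "48104"},
    {"zip_code": "48107", "zcta": "48104"},  # a ZIP with no ZCTA of its own
]

# Hypothetical ZCTA-level measures keyed by ZCTA (placeholder numbers).
nanda_zcta = {
    "48104": {"disadvantage": 0.12, "affluence": 0.61},
}

# Hypothetical ZIP-geocoded records, e.g. survey respondents.
records = [
    {"person_id": 1, "zip_code": "48107"},
    {"person_id": 2, "zip_code": "48104"},
    {"person_id": 3, "zip_code": "99999"},  # not present in the crosswalk
]


def attach_zcta_measures(records, crosswalk, measures):
    """Left-join records to ZCTA measures via a ZIP-to-ZCTA lookup."""
    zip_to_zcta = {row["zip_code"]: row["zcta"] for row in crosswalk}
    merged = []
    for rec in records:
        zcta = zip_to_zcta.get(rec["zip_code"])  # None when the ZIP is unmatched
        row = dict(rec, zcta=zcta)
        row.update(measures.get(zcta, {}))  # no measures added when unmatched
        merged.append(row)
    return merged


merged = attach_zcta_measures(records, crosswalk, nanda_zcta)
for row in merged:
    print(row)
```

Keeping unmatched records (with zcta set to None) rather than dropping them makes it easy to count how many ZIP codes failed to match before any analysis.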
These datasets were created to measure the socioeconomic and demographic characteristics by US census tract for the years 1990-2020 and ZIP code tabulation area (ZCTA) for the years 2008-2020.

Neighborhood Socioeconomic and Demographic Characteristics of Census Tracts, United States, 1990-2010 Data:

Construction of the Neighborhood Socioeconomic Disadvantage and Affluence Variables: To construct a set of variables that would characterize the sociodemographic structure of census tracts over time, the research team conducted a principal factor analysis with an orthogonal varimax rotation of 10 census indicators (log transformed to correct positive skew) in 1990. The aim of the research team was to derive a parsimonious set of factors that capture the shared variance of a broad spectrum of neighborhood structural characteristics. Results from the factor analysis indicated 3 separate factors:

  • The first factor, which the research team interprets as neighborhood disadvantage, is characterized by high levels of poverty, unemployment, female-headed families, households receiving public assistance income, and a high proportion of African Americans in a census tract.

  • The second factor represents a mix of characteristics associated with neighborhood affluence (concentrations of adults with a college education, with incomes greater than 75K, and employed in managerial and professional occupations). Distinguished from other non-disadvantaged census tracts by their large share of high-income, highly educated adults in professional occupations, affluent census tracts are likely to attract a set of institutions (e.g., food stores, places to exercise, well-maintained buildings and parks) that foster a set of norms (e.g., an emphasis on exercise and healthy diets) conducive to good health (Clarke, Morenoff, Debbink, et al., 2014). Distinct from simply being the absence of neighborhood disadvantage, neighborhood affluence is associated with higher levels of social control and leverage over local institutions that can foster social environments that facilitate health (Browning & Cagney, 2003).

  • The third factor represents ethnic and immigrant concentration (higher values indicate more Hispanic and foreign-born residents in the census tract).

Socioeconomic Status and Demographic Characteristics of Census Tracts, United States, 2008-2017 Data:

To construct this dataset, the research team extracted key census indicators related to race, ethnicity, age, income level, employment, poverty, and home ownership from the American Community Survey (ACS) 2012 five-year estimate (covering 2008-2012). The research team merged the variables with the same variables from the ACS 2017 five-year estimate (covering 2013-2017) and with each tract's land area from the 2010 TIGER/Line shapefiles for census tracts. The research team then used those variables to construct three indices as described below: neighborhood disadvantage, neighborhood affluence, and ethnic immigrant concentration.

Construction of the index variables was informed by previous work to construct a set of variables that would characterize the sociodemographic structure of census tracts over time. In 2000, the research team conducted a principal factor analysis with an orthogonal varimax rotation of ten census indicators (log transformed to correct positive skew) (Morenoff et al., 2007). The aim of the research team was to derive a parsimonious set of factors that capture the shared variance of a broad spectrum of neighborhood structural characteristics. Results from the factor analysis indicated three separate factors:

  • The first factor, which the research team interprets as neighborhood disadvantage, is characterized by high levels of poverty, unemployment, female-headed families, households receiving public assistance income, and a high proportion of African Americans in a census tract.

  • The second factor represents a mix of characteristics associated with neighborhood affluence (concentrations of adults with a college education, with incomes greater than 75K, and employed in managerial and professional occupations). Distinguished from other non-disadvantaged census tracts by their large share of high-income, highly educated adults in professional occupations, affluent census tracts are likely to attract a set of institutions (e.g., food stores, places to exercise, well-maintained buildings and parks) that foster a set of norms (e.g., an emphasis on exercise and healthy diets) conducive to good health (Clarke et al., 2014). Distinct from simply being the absence of neighborhood disadvantage, neighborhood affluence is associated with higher levels of social control and leverage over local institutions that can foster social environments that facilitate health (Browning & Cagney, 2003).

  • The third factor represents ethnic and immigrant concentration. Higher values indicate more Hispanic and foreign-born residents in the census tract.

Socioeconomic Status and Demographic Characteristics of ZIP Code Tabulation Areas, United States, 2008-2017 Data:

To construct this dataset, the research team extracted key census indicators related to race, ethnicity, age, income level, employment, poverty, and home ownership from the ACS 2012 five-year estimate (covering 2008-2012). The research team merged the variables with the same variables from the ACS 2017 five-year estimate (covering 2013-2017) and with each ZIP code tabulation area (ZCTA)'s land area from the 2010 TIGER/Line shapefiles for ZIP code tabulation areas. The research team then used those variables to construct three indices as described below: neighborhood disadvantage, neighborhood affluence, and ethnic immigrant concentration.

Construction of the index variables was informed by previous work to construct a set of variables that would characterize the sociodemographic structure of census tracts over time. In 2000, the research team conducted a principal factor analysis with an orthogonal varimax rotation of ten census indicators (log transformed to correct positive skew) (Morenoff et al., 2007). The aim of the research team was to derive a parsimonious set of factors that capture the shared variance of a broad spectrum of neighborhood structural characteristics. Results from the factor analysis indicated three separate factors:

  • The first factor, which the research team interprets as neighborhood disadvantage, is characterized by high levels of poverty, unemployment, female-headed families, households receiving public assistance income, and a high proportion of African Americans in a census tract.

  • The second factor represents a mix of characteristics associated with neighborhood affluence (concentrations of adults with a college education, with incomes greater than 75K, and employed in managerial and professional occupations). Distinguished from other non-disadvantaged census tracts by their large share of high-income, highly educated adults in professional occupations, affluent census tracts are likely to attract a set of institutions (e.g., food stores, places to exercise, well-maintained buildings and parks) that foster a set of norms (e.g., an emphasis on exercise and healthy diets) conducive to good health (Clarke et al., 2014). Distinct from simply being the absence of neighborhood disadvantage, neighborhood affluence is associated with higher levels of social control and leverage over local institutions that can foster social environments that facilitate health (Browning & Cagney, 2003).

  • The third factor represents ethnic and immigrant concentration. Higher values indicate more Hispanic and foreign-born residents in the census tract.

Socioeconomic Status and Demographic Characteristics of Census Tract and ZIP Code Tabulation Areas, United States, 2016-2020 Data:

To construct this dataset, the research team extracted key census indicators related to race, ethnicity, age, income, employment, poverty, and home ownership from the ACS 2020 five-year estimate (covering 2016-2020). The team merged variables with each tract's or ZCTA's land area from the 2020 TIGER/Line shapefiles for census tracts and ZCTAs. The team then conducted a principal components analysis, exploratory factor analysis, and confirmatory factor analysis with census tract indicators from the 2016-2020 ACS 5-year estimates to empirically re-evaluate the neighborhood socioeconomic and demographic indices from previous versions of the data. The aim was to provide a parsimonious set of theoretically-derived factors that capture the shared variance across a broad spectrum of structural socioeconomic characteristics. Results from the factor analysis indicated three separate factors:

  • The first factor, which the team interprets as neighborhood disadvantage, is characterized by high levels of poverty, low family income, and households receiving public assistance income in the neighborhood.

  • The second factor, which the team interprets as neighborhood affluence, is characterized by high levels of people with a college education, families with high income, and people employed in professional/managerial occupations in the neighborhood.

  • The third factor represents a higher proportion of Hispanic residents, foreign-born residents, and people with limited English proficiency in the neighborhood.

Longitudinal

Census tracts and ZIP code tabulation areas in the United States, including Puerto Rico.

Geographic Unit

Census tract-level measures, 1990-2000: Data for 1990 are from the 1990 Census of Population and Housing, which is an extraction of the 1990 Decennial Census. Annual measures for the intervening years (1991-1999 and 2001-2009) are linearly interpolated between the 1990 and 2000 data sources and between the 2000 and 2010 data sources, respectively. Census tract boundaries were normalized to the 2010 tract boundaries using the Longitudinal Tract Data Base described in Logan, Xu, and Stults 2014 (http://dx.doi.org/10.1080/00330124.2014.905156).
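
The linear interpolation of intervening years described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the research team's code: the function name and the endpoint values (a tract measure of 0.10 in 1990 and 0.20 in 2000) are hypothetical.

```python
def interpolate_annual(start_year, start_value, end_year, end_value):
    """Return {year: value} on a straight line between two endpoint years."""
    span = end_year - start_year
    return {
        year: start_value + (end_value - start_value) * (year - start_year) / span
        for year in range(start_year, end_year + 1)
    }


# Hypothetical tract measure observed at two census years.
series = interpolate_annual(1990, 0.10, 2000, 0.20)
for year, value in series.items():
    print(year, round(value, 3))
```

The same function applies to the 2000-2010 span by passing the 2000 and 2010 endpoint values; the endpoint years themselves are returned unchanged.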

Census tract-level measures, 2000-2010: Population values and percentage variables per census tract are from the 2000 Census of Population and Housing Summary and the American Community Survey five-year estimate for 2008-2012. Census tract boundaries were normalized to the 2010 tract boundaries using the Longitudinal Tract Data Base described in Logan, Xu, and Stults 2014 (http://dx.doi.org/10.1080/00330124.2014.905156).

ZCTA-level measures, 2008-2017: Population values and demographic proportions per ZCTA are from the American Community Survey five-year estimates for 2008-2012 and 2013-2017. ZCTA boundaries are based on the 2019 version of the US TIGER/Line shapefiles.

Census tract-level measures, 2008-2017: Population values and percentage variables per census tract are from the American Community Survey five-year estimates for 2008-2012 and 2013-2017. Both are based on census tract boundaries as of 2010. Land area for each census tract comes from the TIGER/Line shapefiles, 2010 version.

Census tract- and ZCTA-level measures, 2016-2020: Population values and proportion variables are from the American Community Survey five-year estimates for 2016-2020. Land area for each census tract/ZCTA comes from the TIGER/Line shapefiles, 2020 version.
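
Pairing population counts with TIGER/Line land area is what makes a population density measure possible. The sketch below is illustrative only: it assumes land area is given in square meters (as in the TIGER/Line ALAND field) and reports density per square mile, which may not match the units or formula used in the NaNDA files. The function name and sample figures are hypothetical.

```python
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # one international mile is 1609.344 m


def population_density(population, land_area_sq_m):
    """People per square mile; None when the area has no land (e.g. all water)."""
    if land_area_sq_m <= 0:
        return None
    return population / (land_area_sq_m / SQ_METERS_PER_SQ_MILE)


# Hypothetical tract: 5,000 people on exactly one square mile of land.
print(population_density(5000, SQ_METERS_PER_SQ_MILE))
# Hypothetical tract with zero land area.
print(population_density(120, 0))
```

Returning None for zero-land areas avoids a division error and keeps such units distinguishable from genuinely low-density ones.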

2022-09-15
