Guide to Creating Artist Extracts and Special Tabulations of Artists from the American Community Survey
The National Archive of Data on Arts and Culture (NADAC) includes information about all artists in the American Community Survey in various years (2008-2012, 2010-2014, 2011-2015, and 2012-2016), along with special tabulations of artists drawn from the American Community Survey for 2015-2019 (ICPSR 38389). The special tabulations were compiled by the Census Bureau at the request of the National Endowment for the Arts (NEA) and show labor force estimates for detailed artist occupations for several levels of geographic coverage. This guide will assist interested users in subsetting the ACS to include only artists and replicating the special tabulations for more recent years.
About the American Community Survey
The American Community Survey (ACS), conducted by the U.S. Census Bureau, replaced the long form of the decennial census, which was last used in 2000. The ACS gives researchers, policy makers, and others access to timely information about the U.S. population to inform decisions about infrastructure and the distribution of federal funds. The survey is fielded monthly and reaches a sample of approximately 3.5 million addresses each year, including addresses in the District of Columbia and Puerto Rico. The ACS includes questions on topics not included in the decennial census, such as those about occupations and employment, education, and key areas of infrastructure like internet access and transportation. Data from a single year can be used to make robust population-level estimates for larger geographic areas such as states, but estimates for smaller areas (or smaller segments of the population, such as specific occupations or race/ethnic groups) require data aggregated over a five-year period. More detailed information about the American Community Survey can be found at the Census Bureau’s ACS resource page.
Data Source
ACS data can be obtained from the Census Bureau and from IPUMS USA. IPUMS adds value to the data available from the Census Bureau by harmonizing the data across years and surveys and by providing detailed documentation and tools for subsetting the data. IPUMS offers the data in both annual and five-year files. The IPUMS User Guide includes information about properly weighting the data, computing and using complex variables, and adjusting dollar amounts over time. Anyone wishing to replicate the special tabulations of artists should refer to this documentation, especially the sections on occupation codes and weighting, as well as the Frequently Asked Questions. Guidance for ACS data users and additional technical documentation are available from the Census Bureau.
Identifying Artists in the ACS
Occupations (and industries) are included in the ACS as four-digit codes based on current classification schemes. Because these codes can change over time, the five-year datasets harmonize the codes to the ones in use at the time the data are released. For example, one coding scheme was in effect from 2000 to 2017 and another from 2018 through the present. The 2015-2019 file therefore uses occupation codes based on the 2018 scheme. IPUMS provides excellent documentation of occupation codes. The following are the occupations (OCC variable) included as “artists,” along with the occupation codes for 2018 to the present:
- architects (except naval): 1305
- landscape architects: 1306
- archivists, curators, and museum technicians: 2400
- artists and related workers (a.k.a., art directors, fine artists, and animators): 2600
- designers: 2631, 2632, 2633, 2634, 2635, 2636, 2640
- actors: 2700
- producers and directors: 2710
- dancers and choreographers: 2740
- musicians and singers: 2752
- music directors and composers: 2751
- entertainers and performers, sports and related workers, all other: 2770
- writers and authors: 2850
- photographers: 2910
- television, video, and film camera operators and editors: 2920
- announcers: 2755, 2805, 2865.
Once you have selected the individuals with these occupation codes, you can choose any additional variables you wish to explore. The variables available at the person (individual) level are grouped by topics into demographics; race, ethnicity, and nativity; health insurance; education; work; income; poverty and socio-economic status; disability and veteran statuses; and characteristics of and travel to/from one’s work location.
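The selection step above can be sketched in Python for readers working outside SPSS or Stata. This is a minimal illustration, not the method used for the published tabulations: it assumes each person record is a dictionary keyed by IPUMS variable names, the helper names `is_artist` and `artist_subset` are hypothetical, and the designer codes are taken to run from 2631 through 2636 plus 2640.

```python
# OCC codes (2018 scheme) for the artist occupations listed above.
# Assumption: designer codes run 2631-2636 plus 2640.
ARTIST_OCC_CODES = {
    1305, 1306,                    # architects; landscape architects
    2400,                          # archivists, curators, museum technicians
    2600,                          # artists and related workers
    2631, 2632, 2633, 2634, 2635, 2636, 2640,  # designers
    2700, 2710, 2740,              # actors; producers/directors; dancers
    2751, 2752,                    # music directors/composers; musicians/singers
    2770,                          # entertainers and performers, all other
    2850, 2910, 2920,              # writers; photographers; camera operators
    2755, 2805, 2865,              # announcers
}


def is_artist(occ):
    """Return True if an OCC value is one of the artist codes."""
    return int(occ) in ARTIST_OCC_CODES


def artist_subset(records):
    """Keep only the person records whose OCC code marks an artist."""
    return [rec for rec in records if is_artist(rec["OCC"])]


# Hypothetical person records; a real extract carries many more variables.
sample = [
    {"OCC": 2850, "PERWT": 20},
    {"OCC": 4700, "PERWT": 15},
    {"OCC": "2752", "PERWT": 12},  # codes read from text files may be strings
]
artists = artist_subset(sample)
print(len(artists))
```

Holding the codes in a single set keeps the list easy to audit against the IPUMS occupation documentation when the coding scheme changes.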
Replicating the Special Tabulations
Structure of the Tables
The special tabulations are tables that show (1) the race and (2) the sex of artists and of the overall labor force at each of three levels of geography (the United States as a whole; the 50 states, Washington, DC, and Puerto Rico; and the 25 largest metropolitan areas). Two characteristics, two populations, and three geographic levels yield 12 tables in total.
Defining Race, Sex, and Membership in the Labor Force
These variables are described specifically to assist users in recreating the special tabulations available for 2015-2019. Many other variables are also available.
Race
The tables describing racial breakdowns of artists and all workers incorporate information from two variables, RACE and HISPAN, with resulting categories of Hispanic (of any race); White, non-Hispanic; Black or African American, non-Hispanic; Asian, non-Hispanic; and Other, non-Hispanic. The “other” race category includes non-Hispanic American Indians and Alaska Natives, Native Hawaiians and other Pacific Islanders, those of some other race, and individuals reporting two or more races. The measurement of race and ethnicity (i.e., Hispanic origin) has changed over time. More information can be found at https://usa.ipums.org/usa-action/variables/RACE#description_section.
Sex
The SEX variable in the ACS denotes whether individuals are male or female.
Labor Force
For these special tabulations, individuals are counted as being in the labor force if they are aged 16 or older and were working or seeking work. Both employed and unemployed individuals are included. The LABFORCE variable can be used to identify those in the labor force.
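As a rough sketch of this definition, the Python below flags labor force members and sums their person weights. It assumes that LABFORCE equals 2 for people in the labor force (as in the IPUMS coding, which should be checked against the codebook for your extract), that the extract also includes the AGE and PERWT variables, and that records are dictionaries; the function names are hypothetical.

```python
def in_labor_force(rec):
    """True for persons aged 16 or older whose LABFORCE code is 2.

    Assumption: LABFORCE uses 0 = not applicable, 1 = not in the labor
    force, 2 = in the labor force. Confirm against your codebook.
    """
    return rec["AGE"] >= 16 and rec["LABFORCE"] == 2


def weighted_labor_force(records):
    """Sum person weights (PERWT) over labor force members."""
    return sum(rec["PERWT"] for rec in records if in_labor_force(rec))


# Hypothetical records for illustration only.
sample = [
    {"AGE": 34, "LABFORCE": 2, "PERWT": 100},  # in the labor force
    {"AGE": 70, "LABFORCE": 1, "PERWT": 50},   # not in the labor force
    {"AGE": 15, "LABFORCE": 0, "PERWT": 80},   # under 16, not applicable
    {"AGE": 22, "LABFORCE": 2, "PERWT": 25},   # in the labor force
]
labor_force_total = weighted_labor_force(sample)
print(labor_force_total)
```

The explicit age check is redundant if LABFORCE is already not applicable for everyone under 16, but it makes the definition visible in the code.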
Level of Aggregation (Geography)
The special tabulations for 2015-2019 are presented at three geographic levels: (1) the entire US; (2) individual states plus the District of Columbia and Puerto Rico; and (3) selected metropolitan areas. To recreate the tables, use the PWSTATE2 (Place of work: state) and PWMET13 (Place of work: metropolitan area [2013 delineations]) or PWPUMA00 (Place of work: PUMA, 2000 onward) variables for geography. For more detailed information about the geography variables, please see https://www.census.gov/programs-surveys/acs/technical-documentation/table-and-geography-changes.html or https://usa.ipums.org/usa/volii/tgeotools.shtml. The metropolitan areas and associated codes used in the 2015-2019 tables were:
- Atlanta-Sandy Springs-Alpharetta, GA Metro Area: 12060
- Baltimore-Columbia-Towson, MD Metro Area: 12580
- Boston-Cambridge-Newton, MA-NH Metro Area: 14460
- Charlotte-Concord-Gastonia, NC-SC Metro Area: 16740
- Chicago-Naperville-Elgin, IL-IN-WI Metro Area: 16980
- Dallas-Fort Worth-Arlington, TX Metro Area: 19100
- Denver-Aurora-Lakewood, CO Metro Area: 19740
- Detroit-Warren-Dearborn, MI Metro Area: 19820
- Houston-The Woodlands-Sugar Land, TX Metro Area: 26420
- Los Angeles-Long Beach-Anaheim, CA Metro Area: 31080
- Miami-Fort Lauderdale-Pompano Beach, FL Metro Area: 33100
- Minneapolis-St. Paul-Bloomington, MN-WI Metro Area: 33460
- New York-Newark-Jersey City, NY-NJ-PA Metro Area: 35620
- Orlando-Kissimmee-Sanford, FL Metro Area: 36740
- Philadelphia-Camden-Wilmington, PA-NJ-DE-MD Metro Area: 37980
- Phoenix-Mesa-Chandler, AZ Metro Area: 38060
- Portland-Vancouver-Hillsboro, OR-WA Metro Area: 38900
- Riverside-San Bernardino-Ontario, CA Metro Area: 40140
- St. Louis, MO-IL Metro Area: 41180
- San Antonio-New Braunfels, TX Metro Area: 41700
- San Diego-Chula Vista-Carlsbad, CA Metro Area: 41740
- San Francisco-Oakland-Berkeley, CA Metro Area: 41860
- Seattle-Tacoma-Bellevue, WA Metro Area: 42660
- Tampa-St. Petersburg-Clearwater, FL Metro Area: 45300
- Washington-Arlington-Alexandria, DC-VA-MD-WV Metro Area: 47900.
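One way to apply the list above is to hold the codes in a lookup table and total person weights by place-of-work metropolitan area. The Python below is a sketch under stated assumptions: records are dictionaries, the extract contains PWMET13 and the person weight PERWT, and the function name `weighted_counts_by_metro` is hypothetical.

```python
# PWMET13 codes and names for the 25 metropolitan areas listed above.
METRO_AREAS = {
    12060: "Atlanta-Sandy Springs-Alpharetta, GA",
    12580: "Baltimore-Columbia-Towson, MD",
    14460: "Boston-Cambridge-Newton, MA-NH",
    16740: "Charlotte-Concord-Gastonia, NC-SC",
    16980: "Chicago-Naperville-Elgin, IL-IN-WI",
    19100: "Dallas-Fort Worth-Arlington, TX",
    19740: "Denver-Aurora-Lakewood, CO",
    19820: "Detroit-Warren-Dearborn, MI",
    26420: "Houston-The Woodlands-Sugar Land, TX",
    31080: "Los Angeles-Long Beach-Anaheim, CA",
    33100: "Miami-Fort Lauderdale-Pompano Beach, FL",
    33460: "Minneapolis-St. Paul-Bloomington, MN-WI",
    35620: "New York-Newark-Jersey City, NY-NJ-PA",
    36740: "Orlando-Kissimmee-Sanford, FL",
    37980: "Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",
    38060: "Phoenix-Mesa-Chandler, AZ",
    38900: "Portland-Vancouver-Hillsboro, OR-WA",
    40140: "Riverside-San Bernardino-Ontario, CA",
    41180: "St. Louis, MO-IL",
    41700: "San Antonio-New Braunfels, TX",
    41740: "San Diego-Chula Vista-Carlsbad, CA",
    41860: "San Francisco-Oakland-Berkeley, CA",
    42660: "Seattle-Tacoma-Bellevue, WA",
    45300: "Tampa-St. Petersburg-Clearwater, FL",
    47900: "Washington-Arlington-Alexandria, DC-VA-MD-WV",
}


def weighted_counts_by_metro(records):
    """Sum PERWT by place-of-work metro, skipping areas not in METRO_AREAS."""
    totals = {}
    for rec in records:
        name = METRO_AREAS.get(rec["PWMET13"])
        if name is not None:
            totals[name] = totals.get(name, 0) + rec["PERWT"]
    return totals


# Hypothetical records for illustration only.
sample = [
    {"PWMET13": 35620, "PERWT": 40},
    {"PWMET13": 35620, "PERWT": 10},
    {"PWMET13": 31080, "PERWT": 25},
    {"PWMET13": 0, "PERWT": 99},  # a code outside the 25 listed areas
]
metro_totals = weighted_counts_by_metro(sample)
print(metro_totals)
```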
Creating the Tables
Once you have created a subset of data with the variables named above (or if you are using the IPUMS online analysis tool), you will be able to create a variable designating individuals as working in the arts and one denoting individuals’ race/ethnicity to use in creating the tables. Example SPSS and Stata syntax for those variables is shown below; the logic is the same in any other statistical package. Running frequency and/or crosstabulation tables (e.g., artist occupation code by race), with the weights described in the IPUMS documentation applied, will give you tables similar to those included in ICPSR study 38389.
Example SPSS Code (note: “…” in the if statements is for the sake of brevity here; if you were creating the variable, you would need to include all codes):
*Comment: Creating “artist” variable.
Compute artist = 0.
If (OCC eq 1305 or OCC eq 1306 or OCC eq 2400 or OCC eq 2600 … or OCC eq 2865) artist = 1.
Value labels artist 0 'Not an artist' 1 'Works in the arts'.
*Comment: Creating “raceethn” variable.
Compute raceethn=0.
Compute Hispanic=0.
If (HISPAN eq 1 or HISPAN eq 2 or HISPAN eq 3 or HISPAN eq 4) Hispanic=1.
If (Hispanic eq 1) raceethn=1.
If (Hispanic ne 1 and RACE eq 1) raceethn=2.
If (Hispanic ne 1 and RACE eq 2) raceethn=3.
If (Hispanic ne 1 and (RACE eq 4 or RACE eq 5)) raceethn=4.
If (Hispanic ne 1 and (RACE eq 3 or RACE eq 6 … or RACE eq 9)) raceethn=5.
Value labels raceethn 1 'Hispanic' 2 'White' 3 'Black or African American' 4 'Asian' 5 'Other race'.
Example Stata Code (note, some artist codes are left out for brevity here):
* Creating "artist" variable
gen artist = 0
replace artist = 1 if inlist(OCC, 1305, 1306, 2400, 2600, 2865)
label define artist_lbl 0 "Not an artist" 1 "Works in the arts"
label values artist artist_lbl
* Creating "raceethn" variable
gen raceethn = 0
gen Hispanic = 0
replace Hispanic = 1 if inlist(HISPAN, 1, 2, 3, 4)
replace raceethn = 1 if Hispanic == 1
replace raceethn = 2 if Hispanic != 1 & RACE == 1
replace raceethn = 3 if Hispanic != 1 & RACE == 2
replace raceethn = 4 if Hispanic != 1 & inlist(RACE, 4, 5)
* RACE codes 7 and 8 are included so that no non-Hispanic respondent is left at 0
replace raceethn = 5 if Hispanic != 1 & inlist(RACE, 3, 6, 7, 8, 9)
label define raceethn_lbl 1 "Hispanic" 2 "White" 3 "Black or African American" 4 "Asian" 5 "Other race"
label values raceethn raceethn_lbl
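The crosstabulation step can be sketched in Python as a weighted table of race/ethnicity by artist status. This is an illustration under assumptions rather than the procedure behind the published tables: the recode mirrors the SPSS and Stata examples (reading the "other" group as RACE codes 3 and 6 through 9), records are dictionaries that include the person weight PERWT, the function names are hypothetical, and the artist code set in the usage example is deliberately shortened. Check the RACE categories against the IPUMS documentation before relying on the grouping.

```python
from collections import defaultdict

RACEETHN_LABELS = {
    1: "Hispanic",
    2: "White",
    3: "Black or African American",
    4: "Asian",
    5: "Other race",
}


def race_ethnicity(race, hispan):
    """Recode RACE and HISPAN into the five categories used in the examples."""
    if hispan in (1, 2, 3, 4):
        return 1                      # Hispanic, of any race
    if race == 1:
        return 2
    if race == 2:
        return 3
    if race in (4, 5):
        return 4
    if race in (3, 6, 7, 8, 9):       # assumption: "other" spans 3 and 6-9
        return 5
    return 0                          # unclassified (e.g., unexpected code)


def weighted_crosstab(records, artist_codes):
    """Sum PERWT into cells keyed by (race/ethnicity label, artist status)."""
    table = defaultdict(float)
    for rec in records:
        code = race_ethnicity(rec["RACE"], rec["HISPAN"])
        row = RACEETHN_LABELS.get(code, "Unclassified")
        col = "artist" if rec["OCC"] in artist_codes else "not artist"
        table[(row, col)] += rec["PERWT"]
    return dict(table)


# Hypothetical records and a shortened artist code set, for illustration only.
sample = [
    {"RACE": 1, "HISPAN": 0, "OCC": 2850, "PERWT": 10},
    {"RACE": 1, "HISPAN": 1, "OCC": 2850, "PERWT": 5},
    {"RACE": 2, "HISPAN": 0, "OCC": 4700, "PERWT": 20},
    {"RACE": 1, "HISPAN": 0, "OCC": 2910, "PERWT": 7},
]
crosstab = weighted_crosstab(sample, artist_codes={2850, 2910})
for cell, total in sorted(crosstab.items()):
    print(cell, total)
```

Summing weights rather than counting rows is what turns sample records into population estimates; unweighted counts would describe only the survey respondents.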