Exercise 1. Party Identification and the Presidential Vote
Step A. Create and interpret Table 1A
Party identification usually is strongly related to voting behavior. This relationship is discussed in the section on voting behavior. To see how strongly party identification was related to the 2016 presidential vote, create Table 1A, which shows the relationship between the individual’s party identification (A07) and his or her presidential vote (A02).
To create a two-variable table, you must specify a row variable and a column variable. A common way of setting up a table is to put the independent variable on the top of the table (this is the column variable) and the dependent variable on the side of the table (this is the row variable). For a discussion of independent and dependent variables, see the section on principles of data analysis. When you are in the SDA crosstabulation program, enter your variables in the row and column dialog boxes.
If the table is set up in the above fashion, you normally would want percentages by columns. You should click this option under table options.
One could construct the table so that the independent variable is the row variable (on the side of the table) and the dependent variable is the column variable (on the top of the table). If the table is set up in this manner, then you normally would want percentages by rows. However, in all of the tables in this module, we have followed the convention of having the independent variable as the column variable and the dependent variable as the row variable.
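The convention described above can be illustrated with a minimal Python sketch. SDA computes these percentages for you; the function name `column_percentages` and the toy data below are invented for illustration and are not taken from the survey.

```python
from collections import Counter

def column_percentages(pairs):
    """Return {column: {row: percent}}; the percentages in each column sum to 100."""
    cell_counts = Counter(pairs)                      # (column, row) -> count
    column_totals = Counter(col for col, _ in pairs)  # column -> count
    table = {}
    for (col, row), n in cell_counts.items():
        table.setdefault(col, {})[row] = 100.0 * n / column_totals[col]
    return table

# Invented toy data: (party identification, presidential vote).
# Party identification is the independent (column) variable.
toy = [
    ("Democrat", "Clinton"), ("Democrat", "Clinton"),
    ("Democrat", "Clinton"), ("Democrat", "Trump"),
    ("Republican", "Trump"), ("Republican", "Trump"),
    ("Republican", "Trump"), ("Republican", "Clinton"),
]

table = column_percentages(toy)
for party, votes in table.items():
    print(party, votes)
```

Each column answers the question "of the people in this party identification category, what share voted for each candidate?" Percentaging by rows instead would answer a different question: of the people who voted for a given candidate, what share fell in each party identification category?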
You should be sure that you have the weight on and that you have selected the weighted Ns to appear in the table. If you do not have the weight on, then you will be analyzing the unweighted data, which will not be a representative sample of the American electorate. See the discussion of weighting the data in the section on survey research methods for an explanation of how and why the data are weighted.
Because the data are weighted, which means that individual respondents may count as more or less than one person (e.g., as .75 or 1.35 persons), the number of respondents in each cell (the Ns) probably will not be whole numbers. If you prefer to have Ns that are whole numbers, you can revise the output to do that by using the “revise the display” option that appears to the left of the table that you generated.
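To see why weighted Ns are usually not whole numbers, here is a minimal Python sketch. It assumes each respondent carries a single weight; the records, the weights, and the function name `weighted_cell_counts` are invented for illustration, and SDA's own weighting and rounding may differ in detail.

```python
from collections import defaultdict

def weighted_cell_counts(records):
    """Sum the respondent weights in each (column, row) cell."""
    cells = defaultdict(float)
    for col, row, weight in records:
        cells[(col, row)] += weight
    return dict(cells)

# Invented toy records: (party identification, vote, weight)
records = [
    ("Democrat", "Clinton", 0.75),
    ("Democrat", "Clinton", 1.35),
    ("Democrat", "Trump", 1.00),
    ("Republican", "Trump", 1.35),
    ("Republican", "Trump", 0.75),
]

cells = weighted_cell_counts(records)

# Rounding each cell to a whole number, roughly what a whole-number display would show.
rounded = {cell: round(n) for cell, n in cells.items()}

for cell, n in cells.items():
    print(cell, f"weighted N = {n:.2f}", f"rounded N = {rounded[cell]}")
```

In this toy example, two respondents fall in the Democrat/Clinton cell, but their weights sum to 2.10, so the weighted N is not a whole number.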
If statistics are desired, that option should be checked under table options. For a discussion of the statistics that are commonly used for contingency tables, see the section on data analysis. In these exercises, we have not asked you to generate statistics, but your instructor may suggest doing so.
The SDA crosstabulation program will produce both a table and a chart, but the chart is not necessary, as all of the information that you need will be contained in the table that you generate. You can revise the output to drop the chart if you like by using the “revise the display” option.
If you ran Table 1A as suggested, you should have a table with seven columns and three rows. Party identification (the independent variable) should be on the top of the table (the column variable), and presidential vote (the dependent variable) should be on the side of the table (the row variable). Percentages should be calculated by column (i.e., they should sum to 100% for each column). In reading your table, take care to properly interpret the percentages, remembering that they are column percentages, not row percentages.
Note: You can check the table that you generated against the example in the data analysis section. The two tables should have the same figures.
You should attempt to answer these questions to see if you are able to correctly read the table and interpret the data:
- Overall, what percentage of respondents in the table voted for Clinton? What percentage voted for Trump? How do these percentages compare with the national figures obtained from actual vote counts (refer to the section on the 2016 election for the election results)? What might explain any differences that you find?
- How many respondents in the table were in each of the seven categories of party identification? Do these numbers match the numbers in the codebook? If not, why not? (Hint: who is not included in Table 1A?)
- What percentage of the respondents who were strong Democrats voted for Clinton? What percentage voted for Trump? How do these percentages compare with the percentages voting for Clinton and Trump in the other categories? How would you describe the overall relationship between these two variables?
Step B. Generate a simpler table
Table 1A that you just generated may be a little difficult to interpret because:
- Party identification has seven categories. This makes the table complicated. If party identification were simplified so that it had fewer categories, the table would be less complicated.
- The presidential vote includes those who voted for minor party candidates. Since so few people voted for minor candidates, it would simplify the table to eliminate them.
General information about why it can be desirable to recode variables is in the section on data analysis.
Create Table 1B, a simpler table, by using recoded versions of party identification and presidential vote.
It often is desirable to simplify the party identification variable so that it has fewer than seven categories. Most often, this is done by recoding the variable to produce three categories: Democrats, independents, and Republicans. There are two possible ways to create these three categories:
- We can combine everyone who indicates some attachment to the Democratic Party (strong Democrats, weak Democrats, and independent Democrats) into one category. The same can be done for those who have some attachment to the Republican Party. This will yield three groups: Democrats, independents, and Republicans. The independents in this case will be only the pure independents—those who do not lean toward one of the parties.
- An alternative method is to combine strong and weak Democrats into one group, strong and weak Republicans into another group, and all independents (including those who lean toward one of the parties) into a third group. This also yields three groups: Democrats, independents, and Republicans. In this case, more respondents will be included in the independent category and fewer in the Democratic and Republican categories.
We favor using the first method to recode party identification into three categories, as we feel that those who lean toward one of the parties are often not much different from those who weakly identify with a party. As we can see from Table 1A, they are similar in their presidential voting. However, some analysts would disagree with this recoding and would favor the alternative method described above. In the analysis exercises, we have assumed that the first method of recoding party identification is used wherever you are asked to recode the variable. Somewhat different results will be obtained if the second method is used for recoding.
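The two recoding methods can be compared side by side in a short Python sketch. It assumes that A07 is coded 1 through 7, running from strong Democrat (1) through pure independent (4) to strong Republican (7), as in the recode example in this step. The dictionary names, the function name `recode_counts`, and the toy values are invented for illustration.

```python
from collections import Counter

# First method: leaners are grouped with the party they lean toward.
LEANERS_AS_PARTISANS = {
    1: "Democrat", 2: "Democrat", 3: "Democrat",
    4: "Independent",
    5: "Republican", 6: "Republican", 7: "Republican",
}

# Alternative method: leaners are grouped with the pure independents.
LEANERS_AS_INDEPENDENTS = {
    1: "Democrat", 2: "Democrat",
    3: "Independent", 4: "Independent", 5: "Independent",
    6: "Republican", 7: "Republican",
}

def recode_counts(values, mapping):
    """Count respondents in each recoded category."""
    return Counter(mapping[v] for v in values)

# Invented toy A07 values for ten respondents.
toy_a07 = [1, 1, 2, 3, 3, 4, 5, 5, 6, 7]

first = recode_counts(toy_a07, LEANERS_AS_PARTISANS)
second = recode_counts(toy_a07, LEANERS_AS_INDEPENDENTS)
print("First method: ", dict(first))
print("Second method:", dict(second))
```

With the same respondents, the second method moves the leaners (codes 3 and 5) into the independent category, which is why it produces more independents and fewer Democrats and Republicans.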
Relatively few respondents voted for a presidential candidate other than Clinton or Trump. Only 7 percent of the survey respondents voted for a minor party candidate, and even that figure is higher than the actual vote for minor party candidates, which was about 6 percent. Including these minor party voters in our tables makes the tables a little more complex and does not add much useful information. If our concern is understanding why people chose Clinton or Trump, it probably is desirable to drop the minor party voters from our analysis. This can be accomplished by defining category 3 of A02 as missing data, which can be done through the recoding procedure.
SDA contains detailed instructions, including examples, on how to use the program to recode a variable.
To recode a variable, enter the recoding instructions in the recoding dialog box that appears when you click on the recode link next to the dialog box where you enter the variable name in the crosstabs builder.
The basic syntax for recoding is simple. Begin with an “r:”, then specify which new values (those on your new, recoded variable) should be equal to which old values (those on the original variable). For example, to recode party identification (A07) in the manner suggested above, the recode syntax should look like this: r: 1=1-3; 2=4; 3=5-7. This will create a new value of “1” that will be composed of old values 1, 2, and 3 (strong, weak, and independent Democrats); a new value of “2” that will be equal to old value 4 (independents); and a new value of “3” that will be equal to old values 5, 6, and 7 (independent, weak, and strong Republicans).
If you want to drop some categories in the original variable, just do not include them in the recode statement. For example, to recode presidential vote (A02) to exclude minor party voters, specify the following recode: r: 1=1; 2=2. This drops the value of “3” on the original variable, leaving only the Clinton and Trump voters.
When you recode a variable, it usually is helpful to attach labels to the new values. To do this, simply add the new label in quotation marks after the statement that indicates what new value equals what old values. For example, the following recode specification will add labels to the recoded version of party identification described above: r: 1=1-3 “Democrat”; 2=4 “Independent”; 3=5-7 “Republican”.
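The effect of these recode statements can be mimicked in Python to check your understanding. This is only a sketch of the logic, not of how SDA works internally: the function name `recode` and the rule format are invented, and the coding of A02 as 1 = Clinton, 2 = Trump is an assumption (the text above states only that category 3 holds the minor party voters).

```python
def recode(values, rules):
    """Apply rules of the form {new_value: (old_values, label)}.

    Old values that no rule lists become None (treated as missing data).
    """
    lookup = {}
    for new_value, (old_values, label) in rules.items():
        for old in old_values:
            lookup[old] = (new_value, label)
    return [lookup.get(v) for v in values]

# Mirrors: r: 1=1-3 "Democrat"; 2=4 "Independent"; 3=5-7 "Republican"
PARTY_RULES = {
    1: (range(1, 4), "Democrat"),
    2: ([4], "Independent"),
    3: (range(5, 8), "Republican"),
}

# Mirrors: r: 1=1; 2=2 (labels assume 1 = Clinton and 2 = Trump)
VOTE_RULES = {
    1: ([1], "Clinton"),
    2: ([2], "Trump"),
}

# Invented toy values for four respondents.
toy_a07 = [1, 4, 7, 3]
toy_a02 = [1, 2, 3, 1]

party = recode(toy_a07, PARTY_RULES)
vote = recode(toy_a02, VOTE_RULES)

# Keep only respondents with valid values on both recoded variables.
valid = [(p[1], v[1]) for p, v in zip(party, vote) if p is not None and v is not None]
print(valid)
```

The third toy respondent has an A02 value of 3, which no rule lists, so that respondent drops out of the table, just as minor party voters drop out of Table 1B.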
You may intend to generate a number of tables with the same variable. To avoid having to enter the recode specification each time that you want to run a table with the variable, you can copy the recode statement to some location (e.g., a Word document) and paste it to the recode syntax window for subsequent table constructions that use the variable.
Step C. Interpret Table 1B
If you ran Table 1B as suggested, you should have a table with three columns and two rows. Your recoded party identification (the independent variable) should be on the top of the table (the column variable), and your recoded presidential vote (the dependent variable) should be on the side of the table (the row variable). Percentages should be calculated by column (i.e., they should sum to 100% for each column). In reading your table, take care to properly interpret the percentages, remembering that they are column percentages, not row percentages.
You should attempt to answer these questions to see if you are able to correctly read the table and interpret the data:
- Overall, what percentage of respondents in the table voted for Clinton? What percentage voted for Trump? How do these percentages compare with those from Table 1A? What explains the difference? (Note: Table 1B presents the two-party or major-party vote for president, a measure that political scientists often find useful.)
- Is Table 1B easier to interpret than Table 1A? Is the relationship clearer to you when you look at Table 1B as compared to Table 1A?
- What is lost when you look at Table 1B as compared to Table 1A? Are there reasons why you might prefer to have Table 1A rather than Table 1B?