How to Choose Which Statistical Test to Use
If you have questions and the data to address them but aren’t sure where to go from there, this guide can help. Find the type of question you want to answer – for example, “how many” or “are these two things related” – in the first column to see the recommended statistical analysis. The data requirements for each test are included.
Based on Question and Variable Types
Use the first table below to find the question that most closely matches what you want to know, then look for the appropriate statistical test in the next column. The second table provides more information specific to each method of analysis. Note: This covers many of the most commonly used statistical tests but is not meant to be an exhaustive list.
Reminders
Some statistical tests require variables to be measured at a specific level of measurement. Depending on how a question is asked, variables can be at one of four levels:
- nominal or categorical: answers are unordered categories, like a list of cities.
- ordinal: answer categories have an order but the categories are of unequal sizes or are “squishy,” such as Likert scales. In some disciplines, it is common to use these variables as if they were continuous.
- interval: answers are actual numbers but there is no true zero; IQ scores and temperature in Fahrenheit are examples.
- ratio: answers are actual numbers with a true zero, such as dollar amounts or number of days.
Note: Interval and ratio are often grouped together (I/R) and called continuous.
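As a rough illustration of why the level of measurement matters, the Python sketch below returns only the summary statistics that suit each level. The function name `summarize`, the level labels, and the sample data are assumptions made for this example and are not part of the guide.

```python
import statistics

def summarize(values, level):
    """Return summary statistics suited to a level of measurement.

    `summarize` and the level labels are hypothetical names for this sketch.
    """
    # The mode is meaningful at every level; multimode returns all ties.
    summary = {"mode": statistics.multimode(values)}
    if level in ("ordinal", "interval", "ratio"):
        # Ordered data supports a median and a range.
        summary["median"] = statistics.median(values)
        summary["range"] = (min(values), max(values))
    if level in ("interval", "ratio"):
        # Continuous (I/R) data supports a mean and a standard deviation.
        summary["mean"] = statistics.mean(values)
        summary["stdev"] = statistics.stdev(values)  # sample standard deviation
    return summary

# Made-up example data, one variable per level of measurement.
cities = ["Austin", "Boise", "Austin", "Denver"]   # nominal
likert = [1, 2, 2, 4, 5]                           # ordinal codes, 1 = strongly disagree
days = [0, 3, 3, 7, 12]                            # ratio

city_summary = summarize(cities, "nominal")    # mode only
likert_summary = summarize(likert, "ordinal")  # mode, median, range
days_summary = summarize(days, "ratio")        # all five statistics
```

Treating the Likert codes as continuous, as some disciplines do, would amount to calling `summarize(likert, "interval")` instead.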
Table 1. Statistical Test by Question
Question | Statistical Test | Notes |
---|---|---|
How many people answered a question this way? Or How did people answer this question? | Frequency table | If there are a lot of unique values, a frequency table may be inefficient and summary statistics (range, mean, mode, etc.) should be used instead. |
What answer was the most common? | Mode (or frequency table) | Applicable for variables at all levels. There may be multiple modes for a single question. |
What is the “average” value of this variable? | Median or Mean | Use median with ordinal-level variables or those whose distributions have significant outliers. Use mean with continuous variables. |
How large is the spread of respondents’ answers and how much do they differ from one another? | Range and Standard Deviation | Both require at least ordinal-level data; standard deviation technically requires continuous data. |
Is a score at one point in time related to a later score on the same item? | Paired-samples T-test | – |
Does group one differ from group two on some continuous variable? (e.g., Do men and women differ on the number of meals they eat out each week?) | Independent-samples T-test | – |
Do members of three or more groups differ from each other on some continuous measure? (e.g., Do individuals with a high school degree watch more TV than those with some college or those with a college degree?) | Oneway ANOVA | – |
Are these two concepts related? | Chi-square or Correlation | Chi-square can be used with nominal or ordinal data; correlation requires two continuous variables. Neither requires identifying an independent and a dependent variable. |
Is my independent variable related to my dependent variable? (e.g., Is the amount of sleep I get at night related to how many minutes I exercise the next day?) | Bivariate Regression | Need to specify which variable is independent and which is dependent. Both are typically at least ordinal, or “dummy” (dichotomous yes/no) variables. |
Does this group of variables predict my outcome of interest? | Multiple Regression | The type of regression depends on what you are trying to predict or explain. Two common types are ordinary least squares (OLS, when the outcome is continuous) and logistic (“yes/no” type outcome). |
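Two rows of Table 1 can be sketched in a few lines of Python using only the standard library: a frequency table and the statistic behind an independent-samples t-test. This is a minimal sketch under stated assumptions: the function names and the data are invented for illustration, and the t statistic uses the pooled (equal-variance) form. Turning the statistic into a p-value requires the t distribution, which a statistics package would normally supply.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance

def frequency_table(answers):
    """Count how often each answer occurs, most common first."""
    return Counter(answers).most_common()

def independent_t(group1, group2):
    """Return (t statistic, degrees of freedom), assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Hypothetical survey answers: "How did people answer this question?"
table = frequency_table(["yes", "no", "yes", "yes"])

# Hypothetical data: meals eaten out per week for two groups of respondents.
men = [2, 4, 4, 6]
women = [1, 2, 3, 2]
t_stat, df = independent_t(men, women)  # t is about 2.19 with 6 degrees of freedom
```

Here `table` lists "yes" three times and "no" once, which is the first row of Table 1 in miniature.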
Table 2. Tests of Significant Differences
Test | Variable Types Needed | Null Hypothesis | Alternative Hypothesis | Test Statistic | Effect Size Statistic |
---|---|---|---|---|---|
One-Sample T-test | I/R (sample mean, population mean) | The means are the same | The means are different (two-tailed) | T-test | Cohen’s D |
Independent-Samples T-test | Independent: Nominal (2 groups) Dependent: Ordinal, I/R | The two samples are drawn from the same population | The populations from which they are drawn are different (two-tailed) | Levene’s test (equal variances vs unequal variances) T-test | Cohen’s D |
Dependent (Paired)-Samples T-test | Independent: Person or pair Dependent: Ordinal, I/R | No change from T1 to T2 | Change from T1 to T2 | T-test | Cohen’s D |
Oneway ANOVA | Independent: Nominal Dependent: Ordinal, I/R | The means for all categories are equal | The mean for at least one category is different | F-ratio Tukey’s HSD (post-hoc tests – which pairs are different) | Eta-squared |
Chi-Square | Var1: Nominal, Ordinal (few categories) Var2: Nominal, Ordinal (few categories) | Variables are unrelated in population | The variables are related | Chi-sq. | Phi or Cramér’s V |
Correlation | Var1: Ordinal, I/R Var2: Ordinal, I/R | No relationship between the variables | The variables are related | Pearson r | R-squared |
Bivariate Regression | Independent: Ordinal, I/R Dependent: Ordinal, I/R | IV is not related to DV (b = 0) | IV is related to DV (b ≠ 0) | T-test | R-squared |
Multiple Regression | Independent(s): Ordinal, I/R, dummy vars Dependent: Ordinal, I/R | Independent variables are not related to the dependent variable (all slopes = 0) | At least one slope is not 0 (IV[s] are related to the DV) | F-ratio (combination) T-test | R-squared |
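The test statistics and effect sizes in Table 2 are short formulas, and writing one or two out can make the table less abstract. The sketch below computes Cohen's d for two groups, and the F-ratio with eta-squared for a oneway ANOVA, using only Python's standard library. The function names and the data are assumptions made for illustration, and Cohen's d is computed here with a pooled standard deviation, which is one common convention among several.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = sqrt(((n1 - 1) * variance(group1)
                      + (n2 - 1) * variance(group2)) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd

def oneway_anova(groups):
    """Return (F-ratio, eta-squared) for a list of groups of scores."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: how far group means sit from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_ratio, eta_sq

# Hypothetical data: two groups for Cohen's d, three groups for the ANOVA.
d = cohens_d([2, 4, 4, 6], [1, 2, 3, 2])
f_ratio, eta_sq = oneway_anova([[1, 2, 3], [2, 3, 4], [4, 5, 6]])
```

With this made-up data, eta-squared is the share of total variation in the scores that lies between the three groups rather than within them.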