Which Test When? - a brief guide to choosing an appropriate method of analysis
When presenting data, it is often desirable to perform a formal test for statistical significance. In general, however, such a test should complement other details, such as summary statistics (e.g. a population mean) and an appropriate graphical display. Below we outline some basic recommendations for what we believe should appear when publishing or presenting research, together with an indication of suitable preliminary analyses.
Analysis of two categorical variables
Where an association is sought between two factors or variables that may take only a small number of discrete values
| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Reasonable Sample | Numbers and percentages, difference in percentages or ratio of percentages (rate ratio) | Chi-square |
| As above, but where one of the factors has several categories and is ordered (e.g. clinical stage) | Numbers and percentages | Chi-square test for trend |
| Small Sample | As above | Fisher's exact test |
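
As an illustration of the first and last rows of the table above, the following is a minimal sketch of the chi-square test and Fisher's exact test for a 2x2 table, written with the Python standard library only. The function names and the example counts are hypothetical and are not taken from this guide; in practice a statistics package would normally be used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties being missed.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows are groups, columns are outcome yes/no.
stat, p = chi_square_2x2(10, 20, 20, 10)
p_small = fisher_exact_2x2(3, 1, 1, 3)
```

The chi-square approximation relies on reasonably large expected counts, which is why the exact test is the one suggested for small samples.
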
Analysis of a single continuous variable
Where the mean or median of a single measurement is required, along with an idea of how precise an estimate has been derived, or what is 'statistically plausible'.
| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Normally Distributed | Mean, standard deviation/error, confidence interval | T-test that the mean equals zero (or some other value) |
| Severely non-Normally distributed | a) Transform (e.g. logarithm) to make Normal, then as above with transformed data b) Median, range/interquartile range, confidence interval | a) As above b) Binomial probability |
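
The two test statistics in the table above can be sketched as follows, using only the Python standard library. The function names and data are hypothetical. The "binomial probability" entry is interpreted here as a sign test on the median, which is an assumption about what the guide intends. The t statistic would still need to be compared with the t distribution (tables or a statistics package) to obtain a p-value.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error, n - 1


def sign_test(values, median0=0.0):
    """Two-sided exact binomial p-value for H0: median == median0."""
    above = sum(v > median0 for v in values)
    below = sum(v < median0 for v in values)
    n = above + below          # values equal to median0 are dropped
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical measurements.
t, df = one_sample_t([1, 2, 3, 4, 5], mu0=0.0)
p_sign = sign_test([1, 2, 3, 4, 5, 6, 7, -1], median0=0.0)
```
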
Comparison of a continuous variable between groups
Where the same measure has been taken from two groups, and it is wished to assess whether the measure differs between them
| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Data form pairs (e.g. before/after) | Calculate differences between pairs, then analyse differences as in previous table | Paired T-test |
| Not paired (two samples), equal standard deviation | Difference in means with confidence interval | (two sample) T-test |
| Not paired (two samples), unequal standard deviation | a) Difference in means with confidence interval b) Difference in medians with confidence interval | a) T-test with unequal variances (Welch) b) Mann-Whitney test |
Where more than two groups are present, alternative approaches to the analysis include:
a) Unpaired, several groups: ANOVA instead of t-test, Kruskal-Wallis test instead of Mann-Whitney
b) Paired, several groups (e.g. measurements over time): repeated measures ANOVA
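
For the two-group comparisons tabulated above, the following is a minimal standard-library sketch of three statistics: the paired t statistic (computed on the within-pair differences), the unequal-variance (Welch) t statistic with its approximate degrees of freedom, and the Mann-Whitney U statistic. Function names and data are hypothetical, and converting each statistic to a p-value is left to tables or a statistics package.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and degrees of freedom for the mean within-pair difference."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def welch_t(x, y):
    """Unequal-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    vx = statistics.variance(x) / len(x)
    vy = statistics.variance(y) / len(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def mann_whitney_u(x, y):
    """U statistic for x: number of pairs with x_i > y_j, ties counting one half."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


# Hypothetical data.
t_paired, df_paired = paired_t([10, 12, 14, 16], [11, 14, 17, 20])
t_welch, df_welch = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
u = mann_whitney_u([1, 3], [2, 3])
```
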
Relationship between two continuous variables
Where an association is sought between two factors or variables that may take on any number of values
| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Two measurements on the same individual (e.g. age and weight) | Strength of (linear) association: correlation coefficient, or Spearman correlation coefficient if non-Normal. Magnitude of association: linear regression | Correlation: test whether equal to zero. Regression: test whether slope equal to zero |
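
The summary statistics named in the table above can be sketched with the Python standard library as follows. All function names and data are hypothetical. The Spearman coefficient is computed as the Pearson coefficient of the ranks, and `t_for_r` gives the usual t statistic (on n - 2 degrees of freedom) for testing whether a correlation equals zero.

```python
import math


def pearson_r(x, y):
    """Pearson (linear) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


def linear_regression(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def t_for_r(r, n):
    """t statistic for H0: correlation == 0 (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical data: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r, rho = pearson_r(x, y), spearman_rho(x, y)
slope, intercept = linear_regression(x, y)
```

In this example the Spearman coefficient is 1 because the relationship is perfectly monotonic, while the Pearson coefficient is slightly below 1 because it is not perfectly linear.
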
Analysis of survivorship between two groups
Where the probability of surviving 'event-free' is to be measured over time, or is to be compared between groups
| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Time to an event (e.g. relapse, death) | Kaplan-Meier survival estimate, N-month/N-year survival probability, median survival time | Logrank test |
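
The Kaplan-Meier estimate and median survival time mentioned in the table above can be sketched as follows. This is a minimal standard-library illustration with hypothetical function names and data; it omits confidence intervals and the logrank test, for which a survival analysis package would normally be used.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. relapse, death) was observed, 0 if censored
    """
    survival, curve = 1.0, []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Events observed exactly at time t.
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

An N-month survival probability can be read from `curve` as the survival value at the last event time not exceeding N.
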

