Which Test When? - a brief guide to choosing an appropriate method of analysis


When presenting data, it is often desirable to perform a formal test for statistical significance. In general, however, such a test should complement other details, such as summary statistics (e.g. a mean) and an appropriate graphical display. Below we outline some basic recommendations for what we believe should appear when publishing or presenting research, together with an indication of suitable preliminary analyses.

Analysis of two categorical variables

Where an association is sought between two factors or variables that may each take only a small number of discrete values.

| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Reasonable sample | Numbers and percentages; difference in percentages or ratio of percentages (rate ratio) | Chi-square test |
| As above, but where one of the factors has several ordered categories (e.g. clinical stage) | Numbers and percentages | Chi-square test for trend |
| Small sample | As above | Fisher's exact test |
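
As an illustration of the first and last cases, the following Python sketch computes a Pearson chi-square statistic and a two-sided Fisher's exact p-value for a 2x2 table using only the standard library. The function names and the example counts are hypothetical, chosen for illustration. The chi-square version applies no continuity correction, and its p-value shortcut is valid only for one degree of freedom (a 2x2 table).

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom only: P(chi-square > stat) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value: total probability of all tables with
    the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows are two groups, columns are outcome yes/no.
stat, p = chi_square_2x2(20, 30, 35, 15)
p_small = fisher_exact_2x2(3, 1, 1, 3)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
print(f"Fisher's exact p = {p_small:.4f}")
```

The percentages and their difference should still be reported alongside either p-value, as the table recommends.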

 

 


Analysis of a single continuous variable

Where the mean or median of a single measurement is required, along with an idea of how precise an estimate has been derived, or what is 'statistically plausible'.

| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Normally distributed | Mean, standard deviation or standard error, confidence interval | T-test that the mean equals zero (or some other value) |
| Severely non-Normally distributed | a) Transform (e.g. logarithm) to make Normal, then as above with transformed data | As above |
| | b) Median, range or interquartile range, confidence interval | Binomial probability |
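
A minimal Python sketch of both routes follows, using only the standard library. It computes the mean, standard error and t statistic for a one-sample t-test, builds a confidence interval from a critical value that the caller looks up in t tables, and computes an exact sign test, which is one common reading of "binomial probability" for a median. All function names and data values are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (mean, standard error, t statistic) for H0: mean == mu0."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values))  # stdev uses the n - 1 denominator
    return m, se, (m - mu0) / se


def confidence_interval(m, se, t_crit):
    """t_crit must be read from t tables for n - 1 degrees of freedom."""
    return m - t_crit * se, m + t_crit * se


def sign_test(values, median0):
    """Exact two-sided sign test for H0: median == median0.
    Observations equal to median0 are dropped."""
    above = sum(v > median0 for v in values)
    below = sum(v < median0 for v in values)
    n, k = above + below, min(above, below)
    # Binomial(n, 1/2) probability of k or fewer in the smaller group, doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical measurements.
m, se, t = one_sample_t([1, 2, 3, 4, 5], mu0=0.0)
low, high = confidence_interval(m, se, t_crit=2.776)  # 95%, 4 degrees of freedom
p_sign = sign_test([5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 6.1, 5.4, 5.9], 5.0)
print(f"mean = {m}, SE = {se:.3f}, t = {t:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
print(f"sign test p = {p_sign:.4f}")
```

The t statistic is converted to a p-value using the t distribution with n - 1 degrees of freedom, which the standard library does not provide; a statistics package or printed tables would be needed for that step.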


Comparison of a continuous variable between groups

Where the same measure has been taken from two groups, and the aim is to assess whether the measure differs between them.

| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Data form pairs (e.g. before/after) | Calculate differences between pairs, then analyse the differences as in the previous table | Paired T-test |
| Not paired (two samples), equal standard deviation | Difference in means with confidence interval | Two-sample T-test |
| Not paired (two samples), unequal standard deviation | a) Difference in means with confidence interval | T-test not assuming equal variances (Welch) |
| | b) Difference in medians with confidence interval | Mann-Whitney test |
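
The sketch below illustrates three of these comparisons in Python with the standard library only: a paired t statistic computed from the within-pair differences, a t statistic that does not assume equal standard deviations (Welch's form, with its approximate degrees of freedom), and the Mann-Whitney U statistic as a rank-based alternative. Function names and data are hypothetical. P-values are left out because they require the t distribution or U tables.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(before, after):
    """Return (mean difference, t statistic, degrees of freedom)."""
    diffs = [a - b for b, a in zip(before, after)]
    se = sqrt(variance(diffs) / len(diffs))
    return mean(diffs), mean(diffs) / se, len(diffs) - 1


def welch_t(x, y):
    """Two-sample t statistic without assuming equal variances.
    Degrees of freedom use the Welch-Satterthwaite approximation."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    diff = mean(x) - mean(y)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return diff, diff / sqrt(vx + vy), df


def mann_whitney_u(x, y):
    """U for sample x: number of (x, y) pairs with x > y, ties counting one half."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


# Hypothetical data.
d_mean, d_t, d_df = paired_t([10, 12, 9, 11], [12, 15, 10, 14])
w_diff, w_t, w_df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
u = mann_whitney_u([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"paired: mean diff = {d_mean}, t = {d_t:.2f}, df = {d_df}")
print(f"Welch: diff = {w_diff}, t = {w_t:.2f}, df = {w_df:.2f}")
print(f"Mann-Whitney U = {u}")
```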

 

Where more than two groups are present, alternative approaches to the analysis include:

a) Unpaired, several groups: ANOVA instead of t-test, Kruskal-Wallis test instead of Mann-Whitney

b) Paired, several groups (e.g. measurements over time): repeated measures ANOVA
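
For the unpaired case with several groups, a one-way ANOVA F statistic can be sketched as below. This is a minimal standard-library illustration with a hypothetical function name and made-up data; the p-value requires the F distribution with the returned degrees of freedom, which is not computed here.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F statistic, between-groups df, within-groups df)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k


# Hypothetical measurements from three independent groups.
f_stat, df_between, df_within = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f} on {df_between} and {df_within} degrees of freedom")
```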

 

 


Relationship between two continuous variables

Where an association is sought between two factors or variables that may take on any number of values.

 

| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Two measurements on the same individual (e.g. age and weight) | Strength of (linear) association: correlation coefficient, or Spearman correlation coefficient if non-Normal | Test whether equal to zero |
| | Magnitude of association: linear regression | Test whether equal to zero |
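
The following Python sketch, written with the standard library only, shows one way to obtain these summaries: the ordinary (Pearson) correlation coefficient, the Spearman coefficient as a Pearson correlation of ranks, a least-squares slope and intercept, and the t statistic commonly used to test whether a correlation is zero. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman_rho(x, y):
    return pearson_r(ranks(x), ranks(y))


def least_squares(x, y):
    """Return (slope, intercept) of the regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def t_for_correlation(r, n):
    """t statistic for H0: correlation == 0, on n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Hypothetical paired measurements.
xs, ys = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
rho = spearman_rho(xs, [1, 4, 9, 16, 25])  # perfectly monotone relationship
slope, intercept = least_squares(xs, ys)
t_r = t_for_correlation(r, len(xs))
print(f"r = {r:.3f}, t = {t_r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

In simple linear regression with one predictor, the test that the slope is zero gives the same t statistic as the test that the Pearson correlation is zero.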

 


Analysis of survivorship between two groups

Where the probability of surviving 'event-free' is to be measured over time, or is to be compared between groups.

| Nature of data available | Some appropriate summary statistics | Test statistic |
|---|---|---|
| Time to an event (e.g. relapse, death) | Kaplan-Meier survival estimate, N-month/N-year survival probability, median survival time | Logrank test |
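
To show how the Kaplan-Meier estimate handles censored observations, here is a minimal Python sketch for a single group, using only the standard library. The function names and the follow-up data are hypothetical. It does not implement the logrank test, and it takes the median survival time to be the first event time at which the estimate falls to 0.5 or below.

```python
def kaplan_meier(times, observed):
    """Return [(time, survival probability)] at each time with an observed event.
    observed[i] is True for an event and False for a censored follow-up."""
    curve = []
    surv = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        events = sum(1 for ti, obs in zip(times, observed) if ti == t and obs)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if events:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1 - events / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which the estimate is 0.5 or below, or None if never reached."""
    return next((t for t, s in curve if s <= 0.5), None)


# Hypothetical follow-up in months; False marks a censored subject.
months = [2, 3, 3, 5, 8]
event_seen = [True, True, False, True, False]
curve = kaplan_meier(months, event_seen)
median = median_survival(curve)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.2f}")
print(f"median survival = {median}")
```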