Refers to the strategies used to reduce the risk of the clinician knowing, prior to randomisation, which group the patient will be allocated to. It is always possible, and it safeguards the assignment sequence before and until allocation.
In hypothesis testing, a null hypothesis (typically, that there is no effect) is compared with an alternative hypothesis (typically, that there is an effect, or that there is an effect of a particular sign). For example, in evaluating whether a new cancer remedy works, the null hypothesis typically would be that the remedy does not work, while the alternative hypothesis would be that the remedy does work. When the data are sufficiently improbable under the assumption that the null hypothesis is true, the null hypothesis is rejected in favour of the alternative hypothesis.
Analysis of Variance
A statistical technique for testing whether or not the means of two or more populations are equal. Also known as ANOVA.
See Analysis of Variance
A graph used to display categorical data. The horizontal axis provides the categories and the vertical axis is the frequency (or percent).
It is defined as any factor or process that tends to deviate the results or conclusions of a study systematically away from the truth. This deviation can result in distortion of the effects of an intervention.
A distribution with two modes (see mode).
Refers to the strategies used to ensure that, typically, trialists and/or trial participants do not know to which intervention group the participant has been allocated. Blinding safeguards the assignment sequence after allocation but is not always possible.
Box & Whisker Plot
See Box Plot.
A graphical display of the location of the quartiles of a dataset and the overall spread of the data. Also known as a box and whisker plot.
A variable whose value ranges over categories and has no numerical value, such as: red, green, blue.
Central Limit Theorem
A theorem that allows us to use the normal distribution to approximate the sampling distribution of the mean whenever the sample is large, even if the distribution of the parent population is non-normal.
A statistically significant result does not necessarily imply that it is useful in a clinical setting (does the treatment reduce a patient's blood pressure by a worthwhile amount? Does it help the patient?). Clinical significance is a matter of judgement taking into account the clinical importance and applicability of the results.
A confidence interval quantifies the level of uncertainty in a measurement or estimate. A 95% confidence interval is the range of values within which we can be 95% sure that the true value of a statistical measure for the whole population lies.
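As a rough illustration of the definition above, the following Python sketch computes a 95% confidence interval for a mean using the normal approximation. The readings are made-up values and the function name `mean_confidence_interval` is our own; for small samples a t-based interval would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value, roughly 1.96 when level is 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 125, 130, 122, 127, 135, 121, 129, 124, 126]
low, high = mean_confidence_interval(readings)
print(f"mean = {mean(readings):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```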
Continuous variable
Variables which assume an infinite number of possible values. Usually obtained by measurement.
Controls are those participants in a study who did not receive the experimental drug, test or intervention.
Correlation is a measure of the relation between two or more variables. Correlation coefficients can range from -1.00 to +1.00. The value of -1.00 represents a perfect negative correlation while a value of +1.00 represents a perfect positive correlation. A value of 0.00 represents a lack of correlation.
Critical region
Region of values of the test statistic for which the null hypothesis is rejected.
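The range of the correlation coefficient described above can be checked with a short Python sketch. It assumes the Pearson coefficient is the one meant (the glossary does not say which), and the function name and data are illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)

# Made-up data: y rises exactly in step with x, then falls exactly in step.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_positive, r_negative)
```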
Degrees of Freedom
The number of degrees of freedom explains how many values are free to vary in the final calculation of a statistic.
The variable that is being predicted or explained by independent variables. It is denoted as y in regression equations.
A quantitative variable in which the scale of measurement varies in discrete steps.
Heterogeneity occurs when there is more variation between the study groups than would be expected by chance alone. This suggests that the groups differ from one another in some systematic way. The opposite of heterogeneity is homogeneity.
Procedure using sample statistics to test the hypothesised values.
Variable used to predict the value of a dependent variable. It is denoted by x in a regression equation.
Two events are independent if the occurrence (or non-occurrence) of one event has no effect on the probability of the other event occurring.
Samples in which the observations are taken from different objects.
Generalizing from samples to populations using probabilities. Performing hypothesis testing, determining relationships between variables, and making predictions.
A measure of dispersion that covers the central 50% of the observations in a dataset.
Level of significance
The maximum probability of a type I error that the researcher will tolerate in the hypothesis testing procedure.
The probability of an event based on current observations.
The mean is a particularly informative measure of the "central tendency" of the variable if it is reported along with its confidence interval. It is calculated by summing the observations and dividing by the number of observations.
The median is a measure of central tendency and is the value for which one-half (50%) of the observations (when ranked) will lie above that value and one-half will lie below that value. When the number of values in the sample is even, the median is calculated as the average of the two middle values.
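The odd and even cases described above can be seen with Python's standard `statistics.median`; the sample values are made up for illustration.

```python
from statistics import median

odd_sample = [7, 1, 3]       # ranked: 1, 3, 7
even_sample = [7, 1, 3, 5]   # ranked: 1, 3, 5, 7

median_odd = median(odd_sample)    # the single middle value
median_even = median(even_sample)  # the average of the two middle values, 3 and 5
print(median_odd, median_even)
```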
This is a systematic review that employs statistical methods to combine and summarise the results of several studies, giving more weight to the most reliable studies.
A technique used to correct the imbalance in patient characteristics across treatment groups. Participants in a clinical trial will be assigned to a group in a manner that will cause the least imbalance (minimising the imbalance) in certain characteristics (or prognostic factors).
A measure of central tendency, the mode of a sample is the value which occurs most frequently in the sample.
The general purpose of multiple regression is to analyse the relationship between several independent or predictor variables and a dependent variable.
It occurs when you make several comparisons on the same dataset and has the effect of inflating the type I error rate (detecting a difference between treatments when a difference does not truly exist). When there's no underlying effect or difference in treatments, getting a statistically significant result (a type I error) is supposed to be like winning the lottery. Lottery numbers are a random phenomenon and you can increase your chances of winning if you buy more tickets. The same applies to the type I error rate: performing many tests on the data increases the chance that something will be statistically significant. The chance of a statistically significant result is supposed to be small when there's no underlying effect, but performing lots of tests makes it large. This is known as multiplicity.
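To put a number on the lottery analogy, the sketch below computes the chance of at least one false positive across several tests. It assumes the tests are independent and each uses the same significance level, which is a simplification; the function name is our own.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

# With a 5% significance level, the chance of a spurious "win" grows quickly.
for n_tests in (1, 5, 20):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```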
Specifies the value of the parameter (difference between two treatments) that needs to be tested (difference between two treatments = 0).
Number needed to treat (NNT)
This is one measure of a treatment's clinical effectiveness. It is the (average) number of people you would need to treat with a specific intervention (e.g. aspirin for people having a heart attack) to see one additional occurrence of a specific outcome (e.g. prevention of death).
Odds are defined as the ratio of the number of people who experience the outcome of interest compared to the number who do not. E.g. if 20 out of 100 people admitted to hospital after a heart attack die within a month, then the odds of dying within a month are 20/80, or 1 to 4.
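The definition above does not give a formula; a common way to calculate the NNT is as the reciprocal of the absolute risk reduction (control risk minus treated risk), conventionally rounded up to a whole person. The sketch below follows that convention with made-up trial counts; the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil

def number_needed_to_treat(control_events, control_total,
                           treated_events, treated_total):
    """NNT as 1 / absolute risk reduction, rounded up to a whole person."""
    control_risk = Fraction(control_events, control_total)
    treated_risk = Fraction(treated_events, treated_total)
    absolute_risk_reduction = control_risk - treated_risk
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce risk in these data")
    return ceil(1 / absolute_risk_reduction)

# Hypothetical trial: 20/100 deaths in controls versus 15/100 with treatment.
nnt = number_needed_to_treat(20, 100, 15, 100)
print(nnt)
```

Exact fractions are used instead of floats so that a risk reduction of exactly 1/20 is not pushed to the next whole number by rounding error.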
Odds ratio (OR) is the odds of having a target disorder (a particular disease or event) in the experimental group relative to the odds of having the target disorder (a particular disease or event) in the control group. It is a way of comparing whether the probability of a certain event is the same for two groups (in a retrospective study). An odds ratio of 1 implies that the event is equally likely in both groups. An odds ratio greater than one implies that the event is more likely in the first group. An odds ratio less than one implies that the event is less likely in the first group.
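A minimal Python sketch of the odds and odds ratio calculations follows. The first call reuses the heart attack figures from the odds entry; the two-group counts are made up, and the function names are our own.

```python
from fractions import Fraction

def odds(events, total):
    """Odds: people with the outcome divided by people without it."""
    return Fraction(events, total - events)

def odds_ratio(exp_events, exp_total, ctl_events, ctl_total):
    """Odds in the experimental group relative to odds in the control group."""
    return odds(exp_events, exp_total) / odds(ctl_events, ctl_total)

glossary_odds = odds(20, 100)               # 20 die, 80 survive
example_or = odds_ratio(10, 100, 20, 100)   # hypothetical two-group trial
print(glossary_odds, example_or, float(example_or))
```

Here the odds ratio is below one, so the event is less likely in the experimental group.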
A trial that has not used blinding to treatments.
Observations that do not appear to follow the characteristic distribution of the rest of the data. These may be genuine observations but could be due to measurement errors or other anomalies. All outliers should be carefully checked.
The p value is the probability of having observed our data (or more extreme data) when the null hypothesis is true. In other words, if the null hypothesis is true, the p value gives the probability of observing our data (or more extreme data) by chance, so a small p value can be read as evidence against the null hypothesis. To illustrate: we sample a classroom of 30 children to test the null hypothesis that boys and girls in the population are, on average, of equal height. A p value of 0.01 means that the probability of collecting the observed heights of the 30 children (or heights with a greater difference between the boys and girls) is 0.01 when boys and girls in the overall population are truly of equal average height.
Parameter
Characteristic or measure obtained from a population.
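The phrase "our data (or more extreme data)" can be made concrete with a simpler, hypothetical example than the classroom one: 9 heads in 10 tosses of a coin, tested against the null hypothesis that the coin is fair. The sketch below adds up the probability of the observed result and of everything more extreme in the same direction (a one-sided p value); the function name is our own.

```python
from math import comb

def binomial_upper_tail(successes, trials, p_null=0.5):
    """Probability of `successes` or more in `trials` if the null is true."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 9 heads in 10 tosses; "more extreme" means 10 heads.
p_one_sided = binomial_upper_tail(9, 10)
print(round(p_one_sided, 4))
```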
Placebo is an inactive treatment often given to controls in trials. The placebo is delivered in a form that is identical to the active treatment being tested in the trial, in order to eliminate psychological effects on the outcome.
Statistical power is the probability of detecting a difference between two groups when one truly exists. Adequate power for a trial is widely accepted as 80%; that means there is a 20% chance of a type II error (not detecting a difference between the treatments when in fact a difference exists). Statistical power and the effect size of interest are taken into account when calculating sample size.
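To show how power and effect size feed into a sample size calculation, here is a sketch using the usual normal-approximation formula for comparing two means with equal group sizes and a two-sided test. The effect size and standard deviation are made-up inputs, the function name is hypothetical, and a t-based calculation would give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate participants per group to detect `effect` between two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # power = 1 - type II error rate
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Hypothetical trial: detect a 5 mmHg difference when the SD is 10 mmHg.
n_80 = sample_size_per_group(effect=5, sd=10, power=0.80)
n_90 = sample_size_per_group(effect=5, sd=10, power=0.90)
print(n_80, n_90)
```

Asking for higher power, or for a smaller detectable effect, increases the required sample size.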
An ordered listing of all possible values of a random variable and their associated probabilities.
Publication bias is the tendency of investigators, reviewers and editors to differentially submit or accept manuscripts for publication on the basis of the direction or strength of the study findings. Most commonly studies with positive results are submitted and accepted for publication while negative studies (or studies which find no difference) fail to appear in the scientific literature.
Variables which assume non-numerical values.
Variables which assume numerical values.
The lower and upper quartiles (also referred to as the .25 and .75 quantiles) are the 25th and 75th percentile of the distribution (respectively). The 25th percentile of a variable is a value such that 25% of the values of the variable fall below that value.
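The quartiles, and the interquartile range built from them, can be computed with Python's standard `statistics.quantiles`. Several conventions exist for interpolating quartiles, so other software may give slightly different values on the same data; the dataset below is made up.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into quarters and returns the three cut points.
lower_quartile, middle, upper_quartile = quantiles(data, n=4)
interquartile_range = upper_quartile - lower_quartile
print(lower_quartile, middle, upper_quartile, interquartile_range)
```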
Random allocation (or randomisation) means that each participant has a known chance, normally an equal chance, of receiving each treatment involved in the trial, but the treatment to be received cannot be predicted.
Randomised controlled trial (RCT)
A trial in which subjects are randomly assigned to two (or more) groups: one (the experimental group) receiving the intervention that is being tested, and the other (the comparison group or controls) receiving an alternative treatment. The groups are then followed up to see if any differences between them result.
A numerical description of the outcome of an experiment.
Concerned with measuring the way in which one variable is related to another. The purpose is to predict the value of a continuous variable using other independent variables.
Residuals are differences between the observed values and the corresponding values that are predicted by a model. They represent the variance that is not explained by the model. The better the fit of the model, the smaller the values of residuals.
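As an illustration of residuals, the sketch below fits a straight line by ordinary least squares and subtracts the predicted values from the observed ones. The data and the helper name `fit_line` are made up for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a straight line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical observations that lie close to, but not exactly on, a line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
residuals = [observed - (intercept + slope * xi) for xi, observed in zip(x, y)]
print([round(r, 3) for r in residuals])
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), so it is their size, not their total, that indicates goodness of fit.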
Risk is often expressed as a proportion (or as a percentage). It is defined as the ratio of the number of people who experience the outcome of interest compared to the total number in the sample. E.g. if 20 out of 100 people admitted to hospital after a heart attack die within a month, then the risk of dying within a month is 20/100 (20%) or 1 in 5.
Relative risk (RR), or risk ratio, is the ratio of the risk of the outcome of interest in the treated group (the experimental event rate, EER) to the risk in the control group (the control event rate, CER). It is typically used in randomised trials and cohort studies.
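A minimal Python sketch of risk and relative risk follows. The first call reuses the heart attack figures from the risk entry; the two-group counts are made up, and the function names are our own.

```python
from fractions import Fraction

def risk(events, total):
    """Risk: people with the outcome divided by everyone in the group."""
    return Fraction(events, total)

def relative_risk(exp_events, exp_total, ctl_events, ctl_total):
    """EER divided by CER."""
    return risk(exp_events, exp_total) / risk(ctl_events, ctl_total)

glossary_risk = risk(20, 100)                  # 20 of 100 die within a month
example_rr = relative_risk(10, 100, 20, 100)   # hypothetical two-group trial
print(glossary_risk, example_rr)
```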
A subgroup or subset of the population.
A probability distribution of all possible values of a sample statistic.
The Shapiro-Wilk W test is used in testing for normality.
Skewness measures the deviation of the distribution from symmetry.
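A measure of dispersion that indicates how far, on average, the data points are spread out from the mean. It is the square root of the variance.
The standard error is the standard deviation of the sampling distribution of a statistic, most commonly the mean.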
Standardisation
An important transformation that brings all values (regardless of their distributions and original units of measurement) onto a common scale with a mean of 0 and a standard deviation of 1. It does not change the shape of the distribution.
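The transformation described above is usually carried out by subtracting the mean from each value and dividing by the standard deviation. The sketch below assumes the sample standard deviation is used; the data and function name are illustrative only.

```python
from statistics import mean, stdev

def standardise(values):
    """Rescale values to have mean 0 and sample standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)
    return [(v - centre) / spread for v in values]

# Hypothetical heights in centimetres.
heights_cm = [150, 160, 165, 170, 180]
z_scores = standardise(heights_cm)
print([round(z, 2) for z in z_scores])
```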
Characteristic or measure obtained from a sample.
Statistics (Not to be confused with statistic). Collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analysing, interpreting, and drawing conclusions.
The t-test is the most commonly used method to evaluate the differences in means between two groups. The groups can be independent (e.g., blood pressure of patients who were given a drug vs. a control group who received a placebo) or dependent (e.g., blood pressure of patients "before" vs. "after" they received a drug, see below). Theoretically, the t-test can be used even if the sample sizes are very small (e.g., as small as 10; some researchers claim that even smaller n's are possible), as long as the variables are approximately normally distributed and the variation of scores in the two groups is not too different.
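For the independent-groups case, the test statistic can be sketched as below. This assumes the classic pooled-variance (Student) form, which matches the condition that the variation in the two groups is similar; converting the statistic to a p value needs the t distribution, which is left out here. The scores and function name are made up.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    degrees_of_freedom = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / degrees_of_freedom
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, degrees_of_freedom

# Hypothetical scores for a treated group and a control group.
treated = [1, 2, 3, 4, 5]
control = [2, 3, 4, 5, 6]
t_value, df = pooled_t_statistic(treated, control)
print(round(t_value, 3), df)
```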
Type I error
The error of rejecting the null hypothesis when it is true.
Type II error
The error of failing to reject the null hypothesis when it is false.
Validity is the degree to which a measurement truly reflects what it claims to measure. When critically appraising a paper it is important to assess whether any known biases could have affected the results (internal validity).
A measure of dispersion; it is the square of the standard deviation.