Important Aspects of Clinical Trials

Clinical trials vary considerably in size, duration and design. These factors play a major part in the interpretation of trial results. 

The most informative design is the ‘double-blind randomised comparison’ in which some participants receive the new medicine while others receive an alternative. The alternative treatment, sometimes called the ‘control’, may be either:

  • A placebo – inactive ‘dummy’ treatment.
  • An active comparator – generally a well-established treatment (the standard of care, sometimes called the ‘gold standard’) for the illness being studied.

Participants are allocated to each study group randomly, i.e. by chance.
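
As an illustration only, random allocation can be sketched in a few lines of Python. The function name `allocate`, the group labels and the participant numbers are all invented for this example. Real trials use more elaborate, concealed randomisation procedures, so this is a minimal sketch of the idea of allocation by chance rather than a description of how any particular trial does it.

```python
import random


def allocate(participant_ids, seed=None):
    """Split participants into two groups purely by chance (simple 1:1 allocation).

    `seed` is only here so the example can be repeated exactly.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)  # put the participants in a random order
    half = len(shuffled) // 2
    # The first half receives the new medicine and the rest the control.
    # With an odd number of participants, the control group gets one extra.
    return {"new_medicine": shuffled[:half], "control": shuffled[half:]}


# Hypothetical trial with 20 participants numbered 1 to 20.
groups = allocate(range(1, 21), seed=42)
print(groups["new_medicine"])
print(groups["control"])
```

Because the order is decided by the shuffle alone, neither the doctor nor the participant can influence who ends up in which group.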

The trial is set up in such a way that while it is going on, neither the doctor nor the participant knows who is receiving which medicine, i.e. it is double-blind. This reduces the potential for bias in the results.

In such trials, the results are presented in terms of the difference between the group receiving the new medicine and the control group:

  • Where the comparison is against a placebo, this difference is a measure of the real effect of the new medicine.
  • Where the comparison is with an active comparator, the difference gives an insight into how the new medicine compares with current medical practice. 

In both cases, two aspects of the difference found are likely to be reported:

  1. Size: This is often reported as a ‘point estimate’ (the actual difference recorded in this particular trial) together with a ‘95% confidence interval’. The confidence interval is the range within which we can be 95% sure that the true difference lies for the population as a whole (all patients having the disease being studied). A difference can be statistically significant without being clinically relevant. Generally speaking, the larger the difference, the more likely it is to be clinically relevant (increasing survival by a year is of more clinical relevance than increasing it by a day).

  2. Statistical significance: Because some individuals respond better than others to treatment, there is always a risk that the difference between groups seen in a clinical trial may have arisen by chance, for example if all the inherently good responders were randomised to one group and the poor responders to the other. Statisticians can calculate how likely it is that chance alone would produce such a result in a particular clinical trial, and they express the answer as a ‘p-value’. This can be defined as the probability that a difference at least as large as the one observed could have arisen by chance if in reality there was no difference between the two treatments. A p-value of 0.05 therefore means that, if the two treatments were really equally effective, a difference this large or larger would be expected in only 5% of trials (1 in 20). This value is conventionally taken as the threshold for accepting results as ‘statistically significant’: a p-value below 0.05 counts as significant. It is important to realise that the word ‘significant’ used in this sense says nothing about the clinical importance of the results – it merely offers reassurance that the result is unlikely to be accidental. For example, a one-metre increase in six-minute walk distance might, in a large enough trial, be shown to be statistically significant (i.e. unlikely to have arisen by chance), but it would never be regarded by a heart-failure patient or their doctor as being of any clinical value.
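
The one-metre walk-distance example can be made concrete with a rough calculation. The Python sketch below computes a point estimate, a 95% confidence interval and a two-sided p-value using a simple large-sample (normal) approximation. The function name `summarise_difference`, the average distances, the spread of 50 metres and the group sizes are all assumptions chosen for illustration; they do not come from any real trial, and a trial statistician might well choose a different method.

```python
from math import sqrt
from statistics import NormalDist


def summarise_difference(mean_new, mean_control, sd, n_per_group, confidence=0.95):
    """Point estimate, confidence interval and two-sided p-value for a
    difference in means between two equal-sized groups.

    Assumes both groups share the same standard deviation `sd` and that the
    groups are large enough for a normal approximation to be reasonable.
    """
    diff = mean_new - mean_control                    # point estimate
    se = sd * sqrt(2 / n_per_group)                   # standard error of the difference
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)     # confidence interval
    # Probability of a difference at least this large if the treatments
    # were really the same (two-sided).
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, ci, p_value


# Hypothetical six-minute walk results: 301 m versus 300 m on average.
small_trial = summarise_difference(301.0, 300.0, sd=50.0, n_per_group=200)
large_trial = summarise_difference(301.0, 300.0, sd=50.0, n_per_group=50_000)

for name, (diff, ci, p) in [("small", small_trial), ("large", large_trial)]:
    print(f"{name}: difference {diff:.1f} m, "
          f"95% CI {ci[0]:.2f} to {ci[1]:.2f} m, p = {p:.4f}")
```

With these made-up figures the point estimate is one metre in both trials. In the small trial the confidence interval includes zero and the p-value is well above 0.05; in the very large trial the interval excludes zero and the p-value falls below 0.05, even though one extra metre is no more useful to the patient than before.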

A second important group of clinical trials, often conducted to investigate long-term safety, takes the form of open-label observational studies. In these there is no control group – everyone is treated with the new medicine and their experience is recorded. No differences between groups can arise (either accidentally or through genuine therapeutic effects) and hence there is no place for significance testing. Balanced against these shortcomings, open-label studies often include large numbers of participants (up to several thousand) studied for long periods of time (several years in some cases). These studies therefore make it easier to detect rare side effects and those that take a long time to develop.
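
A little arithmetic shows why size matters for rare side effects. The Python sketch below estimates the chance of seeing at least one case of a side effect in a study of a given size. It assumes that every participant has the same risk and that participants are affected independently of one another; the function name `chance_of_seeing_side_effect`, the 1-in-1,000 risk and the study sizes are invented for illustration.

```python
def chance_of_seeing_side_effect(risk_per_participant, n_participants):
    """Probability that at least one participant experiences the side effect.

    Assumes each participant has the same, independent risk.
    """
    chance_nobody_affected = (1 - risk_per_participant) ** n_participants
    return 1 - chance_nobody_affected


# Hypothetical side effect affecting 1 in 1,000 people who take the medicine.
for n in (100, 1_000, 3_000):
    chance = chance_of_seeing_side_effect(1 / 1_000, n)
    print(f"{n} participants: {chance:.0%} chance of seeing at least one case")
```

Under these assumptions a study of 100 people would usually miss the side effect altogether, whereas a study of 3,000 people would be very likely to record at least one case.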

The results of such trials list the adverse events that were observed and how frequently each was seen.
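
A listing of this kind can be sketched as follows. The Python example counts how many participants reported each adverse event at least once and expresses that as a percentage of everyone in the study. The function name `adverse_event_rates`, the participant numbers and the events are hypothetical, and real study reports follow more detailed coding and reporting conventions than this minimal sketch.

```python
from collections import defaultdict


def adverse_event_rates(reports, n_participants):
    """Return (event, participants affected, percentage) rows, most common first.

    `reports` is a list of (participant_id, event) pairs. A participant who
    reports the same event several times is counted once for that event.
    """
    participants_by_event = defaultdict(set)
    for participant_id, event in reports:
        participants_by_event[event].add(participant_id)

    rows = [
        (event, len(ids), 100 * len(ids) / n_participants)
        for event, ids in participants_by_event.items()
    ]
    # Most frequent events first; ties are listed alphabetically.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


# Hypothetical reports from a study of 10 participants.
example_reports = [
    (1, "headache"), (1, "headache"), (2, "headache"),
    (3, "nausea"), (2, "dizziness"), (7, "headache"),
]
for event, count, percent in adverse_event_rates(example_reports, n_participants=10):
    print(f"{event}: {count} of 10 participants ({percent:.0f}%)")
```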