1. Where Bias Can Be Introduced

1.4. During Analysis and Interpretation of Results

In a clinical trial sample, it is possible to find sub-groups of patients who respond better to treatment. Sub-group analyses involve splitting the trial participants into sub-groups. The split could be based on:

  • Demographic characteristics (e.g. sex, age).

  • Baseline characteristics (e.g. a specific genomic profile).

  • Use of concomitant therapy.

Findings from sub-group analyses might be misleading for several reasons:

  • Firstly, sub-group analyses are observational (sub-groups are defined on participants’ observed characteristics) and not based on randomised comparisons. When sub-groups are chosen after the results are known, the interpretation is also exposed to hindsight bias. Hindsight bias, also known as the ‘I-knew-it-all-along’ bias, is the inclination to see events that have already occurred as being more predictable than they were before they took place.

  • Secondly, when multiple sub-group analyses are performed, the risk of finding a false positive result (i.e. a type I error) increases with the number of sub-group comparisons. Multiplicity issues generally arise from looking at the same data set repeatedly, in different ways, until something ‘statistically significant’ emerges. Given the wealth of data sometimes obtained, all signals should be considered carefully, and researchers must be cautious about possible over-interpretation. Techniques exist to protect against multiplicity (e.g. the Holm–Bonferroni method and the Hochberg procedure), but they mostly work by requiring stronger evidence for statistical significance from each comparison, so that the overall type I error of the analysis stays controlled.

  • Thirdly, there is a tendency to conduct analyses comparing sub-groups based on information collected while on trial. A typical example is looking at the difference in survival between patients who respond to treatment and those who do not. Participants who respond to treatment are, by definition, patients who were able to spend sufficient time on treatment to allow a response. They may therefore simply represent a sub-group of patients with a better prognosis, which biases the comparison in favour of responders. This is often referred to as guarantee-time bias (also called immortal-time bias). One way of dealing with this is using a landmark as a starting point for the time-to-event analysis, and creating the categories based on the participants’ characteristics at the time of this landmark (e.g. did a participant respond at three months, yes/no).
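
The multiplicity problem described in the second point can be made concrete with a short sketch. The Python code below is illustrative only. It assumes independent comparisons for the family-wise error calculation, and the p-values are invented for the example rather than taken from any trial. The function names are hypothetical. It computes Holm–Bonferroni and Hochberg adjusted p-values, which can then be compared against the usual significance level.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive among n independent
    comparisons, each tested at level alpha (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def holm_adjust(p_values):
    """Holm-Bonferroni (step-down) adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values may never decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def hochberg_adjust(p_values):
    """Hochberg (step-up) adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):  # start from the largest p-value
        idx = order[rank]
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Work downwards, keeping the smallest value seen so far.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented p-values from four hypothetical sub-group comparisons.
    raw = [0.01, 0.04, 0.03, 0.005]
    print("Family-wise error, 10 comparisons at 5%:",
          round(familywise_error_rate(0.05, 10), 3))
    print("Holm:    ", [round(p, 3) for p in holm_adjust(raw)])
    print("Hochberg:", [round(p, 3) for p in hochberg_adjust(raw)])
```

In this invented example every raw p-value is below 0.05, yet after Holm adjustment only two comparisons remain below that level. The Hochberg procedure is never more conservative than Holm, but its guarantee relies on additional assumptions about how the test statistics depend on each other.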
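
The landmark approach described in the third point can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete analysis. The `Participant` record, its field names, and the three-month landmark are hypothetical. Participants whose follow-up ends at or before the landmark are excluded, which is one common convention, and the resulting rows would still need to be passed to a proper time-to-event method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Participant:
    pid: str
    followup_months: float            # time from randomisation to death or censoring
    died: bool                        # True if death was observed, False if censored
    response_month: Optional[float]   # month of first documented response, None if none


def landmark_dataset(
    participants: List[Participant], landmark_months: float = 3.0
) -> List[Tuple[str, bool, float, bool]]:
    """Build rows of (pid, responder_at_landmark, months_since_landmark, died)."""
    rows = []
    for p in participants:
        # Only participants still being followed after the landmark are analysed.
        if p.followup_months <= landmark_months:
            continue
        # Group membership is fixed by what was known at the landmark;
        # a response that occurs later does not change the group.
        responder = (
            p.response_month is not None and p.response_month <= landmark_months
        )
        # The clock for the time-to-event analysis restarts at the landmark.
        rows.append((p.pid, responder, p.followup_months - landmark_months, p.died))
    return rows


if __name__ == "__main__":
    # Invented records for illustration only.
    cohort = [
        Participant("A", 12.0, True, 2.0),    # responded before the landmark
        Participant("B", 2.0, True, None),    # died before the landmark: excluded
        Participant("C", 10.0, False, 5.0),   # responded after the landmark
        Participant("D", 8.0, True, None),    # never responded
    ]
    for row in landmark_dataset(cohort):
        print(row)
```

Because the groups are defined only by information available at the landmark, a participant cannot be counted as a responder merely by surviving long enough to respond later. The price is that early events are left out and the result depends on the choice of landmark, so the landmark is best fixed in advance.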