5.1. Types of Observational Studies: Cohort Studies


1. Cohort studies

A cohort is any group of individuals sharing a common characteristic. For instance, the shared characteristic may be:

  • A demographic factor, such as age, race, sex, or birth within a given time frame;
  • A constitutional factor, such as blood group or immune status;
  • A behaviour or activity, such as smoking or having attended a certain public event; or
  • A circumstance, such as living near a toxic waste site.

Cohort studies are longitudinal observational studies that investigate predictive risk factors and health outcomes within one or more cohorts. A cohort may comprise healthy persons, or the study may start by sampling people with a disease or condition. Cohort studies differ from clinical trials in that the investigators do not administer any intervention, treatment, or exposure (see below) to the participants. Because cohort studies are observational, they predominantly serve to determine an association (correlation) between an exposure and an outcome (e.g., disease) rather than a causal relationship.

Exposures can be general characteristics, such as age or sex; risk factors, such as smoking or alcohol consumption; a health-related intervention; or a disease. Exposure can be categorised as present or absent, or by level of exposure, such as blood pressure.

A central feature of a cohort study is that, for a cohort of exposed persons, the risk (or the rate) of the outcome can be calculated as [(persons with both exposure and outcome) / (all exposed persons)]. If a comparison group of unexposed persons is included, a relative risk can be calculated as [(risk in exposed persons) / (risk in unexposed persons)].
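The risk and relative risk calculations described above can be sketched in a few lines of Python. The counts are hypothetical and chosen only to make the arithmetic easy to follow; the function name `risk` is likewise an illustrative choice.

```python
def risk(cases: int, total: int) -> float:
    """Risk = persons with the outcome / all persons in the group."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


# Hypothetical counts, invented for illustration only.
exposed_cases, exposed_total = 30, 200
unexposed_cases, unexposed_total = 15, 300

risk_exposed = risk(exposed_cases, exposed_total)        # 30 / 200 = 0.15
risk_unexposed = risk(unexposed_cases, unexposed_total)  # 15 / 300 = 0.05

# Relative risk compares the risk in the exposed group with the unexposed group.
relative_risk = risk_exposed / risk_unexposed

print(f"Risk in exposed:   {risk_exposed:.3f}")
print(f"Risk in unexposed: {risk_unexposed:.3f}")
print(f"Relative risk:     {relative_risk:.2f}")
```

With these made-up numbers, the exposed group has three times the risk of the unexposed group. Without the unexposed group, only the first risk could be reported.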

Of note, a comparison group is not a defining feature of a cohort study. For example, no comparison group is needed when the aim of the study is to describe the course of a disease or its prognosis.

Cohort studies may be prospective or retrospective.

A prospective cohort study, also called a concurrent cohort study, follows participants for a period of time (often years) and records the outcomes of interest as they occur. The study is designed before any information is collected. The outcome of interest should be common; otherwise, the number of outcomes observed will be too small to be statistically meaningful (that is, indistinguishable from a number that may have arisen by chance). All efforts should be made to avoid sources of bias, such as the loss of individuals to follow-up during the study. Prospective studies usually have fewer potential sources of bias and confounding than retrospective studies.
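To see why a rare outcome is a problem for a prospective design, it can help to estimate how many outcome events a cohort would be expected to yield. The sketch below multiplies cohort size by an assumed cumulative risk over the follow-up period; the cohort size and both risk values are hypothetical, and losses to follow-up are ignored.

```python
def expected_events(cohort_size: int, cumulative_risk: float) -> float:
    """Expected number of outcome events = cohort size x risk over follow-up."""
    if not 0.0 <= cumulative_risk <= 1.0:
        raise ValueError("cumulative_risk must be between 0 and 1")
    return cohort_size * cumulative_risk


cohort_size = 10_000  # hypothetical number of participants

# Hypothetical cumulative risks over the whole follow-up period.
scenarios = [("common outcome", 0.05), ("rare outcome", 0.0005)]

for label, cumulative_risk in scenarios:
    events = expected_events(cohort_size, cumulative_risk)
    print(f"{label}: about {events:.0f} expected events")
```

Under these assumptions the common outcome yields roughly 500 events, while the rare outcome yields roughly 5, which is too few to compare groups reliably.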

In a retrospective cohort study, both the exposure and the outcome have already occurred at the outset of the study. Retrospective studies therefore look backwards: for example, they examine past exposure to a possible risk factor in relation to an observed outcome. Often, the information used was collected for reasons other than research, such as administrative data or medical records. While this type of cohort study is less time-consuming and less costly than a prospective cohort study, it is more susceptible to confounding and bias, and special care should be taken to limit both. However, if the outcome of interest is uncommon, the size of the prospective investigation required to estimate relative risk is often too large to be feasible, and a retrospective study is an alternative. In retrospective studies the odds ratio[1] provides an estimate of relative risk; the two values are close when the outcome is rare.
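How well the odds ratio estimates the relative risk depends on how common the outcome is. The following sketch compares the two measures on a pair of hypothetical 2x2 tables, one with a rare outcome and one with a common outcome. All counts and function names are invented for illustration.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of the outcome in the exposed divided by the odds in the unexposed."""
    return (a / b) / (c / d)


# Hypothetical tables; both have a true relative risk of 2.
rare_outcome = (20, 9980, 10, 9990)   # risks of 0.2% and 0.1%
common_outcome = (400, 600, 200, 800)  # risks of 40% and 20%

for label, table in [("rare", rare_outcome), ("common", common_outcome)]:
    rr = relative_risk(*table)
    or_ = odds_ratio(*table)
    print(f"{label} outcome: RR = {rr:.2f}, OR = {or_:.2f}")
```

In the rare-outcome table the odds ratio is almost identical to the relative risk, whereas in the common-outcome table it is noticeably larger (about 2.67 against 2.00).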

Example: Selection Bias in a Retrospective Cohort Study

In a retrospective cohort study, selection bias occurs if the selection of exposed and non-exposed subjects is somehow related to the outcome.

  • The study investigates an occupational exposure (an organic solvent) that occurred 15-20 years ago in a factory.
  • Exposed and unexposed subjects are enrolled based on employment records, but some records were lost.
  • Suppose records were more likely to be retained for workers who were exposed and became ill: 20% of employee health records were lost or discarded, except among “solvent” workers who reported illness, for whom only 1% were lost.
  • As a result, workers in the exposed group were more likely to be included if they had the outcome of interest.

Source: Challenges of Observational and Retrospective Studies, Kyoungmi Kim, Ph.D., March 8, 2017
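The effect of this differential record loss can be worked through numerically. The sketch below is not taken from the cited source: it reuses the 20% and 1% loss figures from the example but assumes a hypothetical workforce in which exposed and unexposed workers have exactly the same true risk of disease.

```python
def relative_risk(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exp_cases / (exp_cases + exp_noncases)
    risk_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return risk_exposed / risk_unexposed


# Hypothetical true workforce: 10% risk in both groups, so the true RR is 1.
true_counts = {
    "exp_cases": 100, "exp_noncases": 900,
    "unexp_cases": 100, "unexp_noncases": 900,
}

# Record retention from the example: 1% loss for exposed workers who
# reported illness, 20% loss for everyone else.
retention = {
    "exp_cases": 0.99, "exp_noncases": 0.80,
    "unexp_cases": 0.80, "unexp_noncases": 0.80,
}

# Records that survive and are therefore available for enrolment.
observed_counts = {key: true_counts[key] * retention[key] for key in true_counts}

true_rr = relative_risk(**true_counts)
observed_rr = relative_risk(**observed_counts)

print(f"True relative risk:     {true_rr:.2f}")
print(f"Observed relative risk: {observed_rr:.2f}")
```

Under these assumptions the surviving records suggest a relative risk of about 1.21 even though the solvent has no effect on risk at all. The apparent association comes entirely from which records were retained.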


[1] An odds ratio (OR) is a measure of association between an exposure and an outcome. The OR represents the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure. The odds of an event are the ratio of the probability that it will happen to the probability that it will not. In numerical terms, an OR above 1 reflects increased odds of the outcome of interest in the exposed group, an OR below 1 reflects decreased odds, and an OR of 1 indicates no association.