HTA and Evaluation Methods: 3.1 Clinical effectiveness assessment in HTA

1. Seeking Information

Gather relevant data from multiple sources to understand current evidence.

2. Asking Relevant Questions

Identify key questions about effectiveness, safety, and patient outcomes.

3. Understanding Differences

Compare results across settings and patient populations to identify variations.

4. Valuing Differences

Interpret the significance of differences to guide decision-making.

1. Seeking Information

HTA bodies use clinical information to estimate what health outcomes patients might experience when using a new medicine. The first step is deciding how to collect this information.

There are three main approaches HTA bodies may use:

Reviewing existing information on the medicine’s performance
Conducting a study (e.g. a pharmacoepidemiological study) to evaluate how the medicine works in a real-world setting
Asking clinicians and patients (experts) about their experiences (e.g. surveys)

HTA bodies often use a combination of these approaches. For example:

They might use information from the medicine’s marketing authorisation holder (MAH) to conduct their own independent reviews and analyses.
When data is lacking, expert opinion can help. For example, experts might be asked whether a short-term outcome (like lowering cholesterol) could be used to predict a long-term benefit (like avoiding hospitalisation).

New clinical trials are rarely commissioned due to time and cost, so pharmacoepidemiological studies are often more practical. In some cases, decision-makers allow a medicine to be reimbursed on a conditional basis, while more evidence is collected. During this period:

Patients may access the medicine immediately
MAHs and payers can share the risk if the medicine performs worse than expected
Pricing or reimbursement conditions may be adjusted, or access limited to certain patient groups

2. Asking Relevant Questions

When assessing the clinical effectiveness of a new medicine, HTA bodies must carefully consider the outcomes it produces. Understanding these outcomes ensures the right questions are asked.

There is growing awareness that outcomes seen as important by clinicians may not always reflect what matters most to patients. For this reason, involving patients in study design is essential to ensure relevant outcomes are measured. For example, quality of life has increasingly been recognised as an important outcome for patients. This has led to the development of specific methods to measure quality of life and other patient-reported outcomes (PROs) in clinical trials and pharmacoepidemiological studies.

One way to ensure that all the important outcomes of a particular technology are examined is to use an analytic framework, for instance in the form of a flowchart. Analytic frameworks help to visualise all of the outcomes associated with an intervention and to highlight areas of uncertainty.
In an analytic framework (as illustrated in Figure 1):

Arrows show cause and effect (curved arrows indicate harmful outcomes)
Sharp-cornered rectangles represent clinically relevant endpoints, which are directly perceived by the patient (e.g. reduced chest pain)
Rounded rectangles represent intermediate or surrogate endpoints, which are not directly perceived by the patient (e.g. cholesterol levels in blood)

🔑 Key Questions

Key question 1. Is screening for dyslipidaemia in children/adolescents effective in delaying the onset and reducing the incidence of CHD (coronary heart disease) related events?
Key question 2. What is the accuracy of screening for dyslipidaemia in identifying children/adolescents at increased risk of CHD-related events and other outcomes?
Key question 3. What are the adverse effects of screening (including false positives, false negatives)?
Key question 4. In children and adolescents, what is the effectiveness of medicine, diet, exercise, and combination therapy in reducing the incidence of adult dyslipidaemia, and delaying the onset and reducing the incidence of CHD-related events and other outcomes (including optimal age for initiation of treatment)?
Key questions 5–8. What is the effectiveness of medicine, diet, exercise, and combination therapy for treating dyslipidaemia in children/adolescents (including the incremental benefit of treating dyslipidaemia in childhood)?
Key question 9. What are the adverse effects of medicine, diet, exercise, and combination therapy in children/adolescents?
Key question 10. Does improving dyslipidaemia in childhood reduce the risk of dyslipidaemia in adulthood?
Key question 11 (not pictured). What are the cost issues involved in screening for dyslipidaemia in asymptomatic children?

A thorough assessment of clinical effectiveness should address the following questions:

How comprehensive was the information provided for a thorough HTA?

Systematic reviews conducted in a rigorous and transparent manner with extensive information provided by pharmaceutical companies are most likely to be comprehensive and balanced;
Real-world studies should still consider whether what is observed is consistent with previous studies (2).

How accurate is the information?

Real-world information gathered within the health system for which a decision has to be made is most likely to be relevant;
Relying solely on reviews or summaries, without a clear understanding of the methods used to produce them, may give an unbalanced picture.

Is anything missing?

Important information may be overlooked if patients and healthcare providers are not involved, if searches are incomplete, or if pharmaceutical companies are not asked for full data.

How understandable is the information?

Information should reflect changes that are meaningful to patients, be interpreted correctly, and be presented in a way that is easy for non-experts to understand.

3. Understanding Differences

Once all important outcomes are identified, there may still be several challenges in comparing the effects of a new medicine with the standard of care. Outcomes may be measured in different ways, or two medicines may seem to have similar outcomes until closer inspection shows differences.

Sometimes, the outcomes that matter to patients (such as being able to return to work or get out of bed) are not directly measured. In these cases, scientists need to create measures that can be used in studies, that is, measures that are methodologically reproducible and comparable.

For example, in patients with lower back pain, a numeric pain-rating scale might be used. But this kind of ‘laboratory’ measurement needs to be translated into something that is meaningful to patients—like improved daily function or the ability to work.

Some outcomes may seem intuitive but, upon closer examination, may be difficult to interpret.

For example, a reduction of the risk of five-year mortality (death within five years) by an average of 50% does not mean that the medicine can prevent premature death in 50% of all cases. It could simply be:

Extending life expectancy from 4.9 to 5.1 years (or worse, 4.99 years to 5.01 years) in some patients, or
Extending life expectancy in very few patients while not extending it at all in the vast majority.
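The first scenario above can be made concrete with a small sketch. The cohort below is entirely invented for illustration (the survival times, group sizes, and function names are assumptions, not data from any study): a group of patients who would have died just before the five-year mark instead die just after it, which halves five-year mortality while barely changing average survival.

```python
# Hypothetical survival times in years; all numbers are invented for illustration.
control = [2.0] * 20 + [4.9] * 20 + [10.0] * 60
treated = [2.0] * 20 + [5.1] * 20 + [10.0] * 60


def five_year_mortality(survival_years):
    """Share of patients who die before the five-year mark."""
    deaths = sum(1 for t in survival_years if t < 5.0)
    return deaths / len(survival_years)


def mean_survival(survival_years):
    """Average survival time in years."""
    return sum(survival_years) / len(survival_years)


mortality_control = five_year_mortality(control)
mortality_treated = five_year_mortality(treated)
relative_reduction = 1 - mortality_treated / mortality_control
mean_gain = mean_survival(treated) - mean_survival(control)

print(f"Five-year mortality: {mortality_control:.0%} -> {mortality_treated:.0%}")
print(f"Relative reduction in five-year mortality: {relative_reduction:.0%}")
print(f"Average gain in survival: {mean_gain:.2f} years")
```

In this invented cohort the headline figure is a 50% relative reduction in five-year mortality, yet the average patient gains only a few hundredths of a year.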

Even if differences in measures that are meaningful to patients are observed, these may still be ambiguous to interpret.

For example, studies may indicate a new medicine reduced the risk of hospitalisation from infection by 33%. However, this may mean different things. It could mean that:

The chance of hospitalisation is reduced by 33% relative to the chance of being hospitalised without medicine (this is called a relative reduction in risk). A 33% reduction may seem to be a considerable benefit.
If the chance of being hospitalised in the absence of the new medicine is 3 out of 1,000, then a 33% reduction lowers this to 2 out of 1,000. This means that 1 out of every 1,000 people taking the medicine will benefit (this is called an absolute reduction in risk). A benefit for 1 out of 1,000 patients may well seem marginal.
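The arithmetic behind the two readings can be sketched as follows, using the 3 in 1,000 and 2 in 1,000 figures from the example. The function name is an assumption made for illustration, and the "number needed to treat" is a standard companion measure that this text does not otherwise discuss.

```python
def risk_reduction(baseline_risk, treated_risk):
    """Return relative risk reduction, absolute risk reduction and
    number needed to treat for two risks expressed as proportions."""
    absolute = baseline_risk - treated_risk   # absolute reduction in risk
    relative = absolute / baseline_risk       # relative reduction in risk
    number_needed_to_treat = 1 / absolute     # patients treated per one who benefits
    return relative, absolute, number_needed_to_treat


# Figures from the hospitalisation example: 3 in 1,000 without the medicine,
# 2 in 1,000 with it.
rrr, arr, nnt = risk_reduction(3 / 1000, 2 / 1000)

print(f"Relative reduction in risk: {rrr:.0%}")
print(f"Absolute reduction in risk: {arr * 1000:.0f} per 1,000 patients")
print(f"Number needed to treat: about {nnt:.0f}")
```

The same trial result therefore reads as "a third fewer hospitalisations" or as "one hospitalisation avoided per 1,000 patients treated", depending on which measure is reported.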

A final challenge with correctly interpreting the differences between a new health technology and the standard of care is the use and misuse of statistical tests. Statistical tests are intended to help researchers judge whether the differences they have detected are likely to reflect a real effect of the medicine or could have happened by chance. Often, this is reported in the form of a p-value: the smaller the p-value, the less likely it is that a difference of the observed size would arise by chance alone. However, p-values do not reflect the magnitude (size) of the difference, or whether that difference is clinically meaningful.
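A minimal sketch can show why a p-value says nothing about the size of a difference. It compares two invented trials using a standard two-proportion z-test written with the Python standard library; the event counts, trial sizes, and function name are all assumptions chosen for illustration.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for the difference between two proportions
    (pooled z-test with a normal approximation)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the normal curve
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical very large trial: risk falls from 10.0% to 9.8% (tiny difference).
p_tiny_effect = two_proportion_p_value(100_000, 1_000_000, 98_000, 1_000_000)

# Hypothetical small trial: risk falls from 10% to 5% (large difference).
p_large_effect = two_proportion_p_value(10, 100, 5, 100)

print(f"Tiny difference, huge trial:  p = {p_tiny_effect:.2g}")
print(f"Large difference, small trial: p = {p_large_effect:.2g}")
```

In these invented trials, the very small difference produces a far smaller p-value than the large one, simply because so many more patients were studied.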

Other statistical measures such as confidence intervals are more informative than the p-value, because they also give some sense of the size of the difference between the new health technology and the standard of care. Confidence intervals also reflect any uncertainty about the estimate of the magnitude of difference.

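For example, a new medicine may be reported to reduce the chance of having a future heart attack by 33% (with a 95% confidence interval of 5% to 45%) relative to the current chance of having a heart attack. The interval indicates that the true reduction could plausibly be as small as 5% or as large as 45%, so the estimate of 33% carries considerable uncertainty.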
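As a rough illustration of where such an interval comes from, the sketch below computes a relative risk reduction and an approximate 95% confidence interval from trial counts, using the common log-based normal approximation for a risk ratio. The counts and the function name are hypothetical, and the result is not meant to reproduce the 5% to 45% interval quoted in the example.

```python
import math


def relative_risk_reduction_ci(events_treated, n_treated,
                               events_control, n_control, z=1.96):
    """Relative risk reduction with an approximate 95% confidence interval,
    based on a normal approximation to the log of the risk ratio."""
    risk_ratio = (events_treated / n_treated) / (events_control / n_control)
    se_log_rr = math.sqrt(1 / events_treated - 1 / n_treated
                          + 1 / events_control - 1 / n_control)
    rr_low = math.exp(math.log(risk_ratio) - z * se_log_rr)
    rr_high = math.exp(math.log(risk_ratio) + z * se_log_rr)
    # Reduction = 1 - risk ratio, so the interval bounds swap places.
    return 1 - risk_ratio, 1 - rr_high, 1 - rr_low


# Hypothetical trial: 67 heart attacks among 1,000 treated patients,
# 100 among 1,000 patients on the standard of care.
rrr, ci_low, ci_high = relative_risk_reduction_ci(67, 1000, 100, 1000)

print(f"Relative risk reduction: {rrr:.0%} "
      f"(95% CI {ci_low:.0%} to {ci_high:.0%})")
```

A smaller trial with the same proportions would give the same 33% estimate but a much wider interval, which is the uncertainty the confidence interval is meant to convey.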

4. Valuing Differences

The final challenge in assessing clinical effectiveness is understanding how people perceive and value differences between outcomes. For example, if a medicine increases life expectancy by 0.2 years, we still need to know:

how much a patient would value 0.2 years of additional life expectancy
if all patients experience roughly the same gains or if there are dramatic differences between patients, and
if all patients value these gains similarly.

A new medicine that increased life expectancy by an average of 0.2 years would be perceived differently if it worked in some patients but had no effect on others than if all patients gained about 0.2 years, with little difference between them.
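The contrast between these two scenarios can be sketched with two invented groups of 100 patients that share the same average gain of 0.2 years. The individual gains and the helper name are assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical gains in life expectancy (years) for 100 patients each.
uniform_gains = [0.2] * 100                      # everyone gains 0.2 years
concentrated_gains = [2.0] * 10 + [0.0] * 90     # 10 patients gain 2 years, 90 gain nothing


def share_benefiting(gains):
    """Proportion of patients with any gain at all."""
    return sum(1 for g in gains if g > 0) / len(gains)


for label, gains in [("Uniform", uniform_gains),
                     ("Concentrated", concentrated_gains)]:
    print(f"{label}: average gain {mean(gains):.1f} years, "
          f"{share_benefiting(gains):.0%} of patients benefit, "
          f"largest gain {max(gains):.1f} years")
```

Both groups report the same average, so the average alone cannot tell a decision-maker which of the two situations a medicine represents.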

Two types of research can help establish how patients value outcomes:

Qualitative research, such as surveys or focus groups, is intended to provide an understanding of which outcomes are most important to patients (you will learn more about qualitative research in Course 4).
Quantitative research uses tools like rating scales to assign numerical values to specific outcomes in different health states.

In both cases, it’s important that the instruments used for measuring preferences and values are properly tested and validated to ensure their results are reliable and meaningful.
