1. Data sources for pharmacoepidemiological research

1.5. Health database networks

Pooling data across different databases affords insight into the generalisability of the results and may improve precision. A growing number of studies use data from networks of databases, often from different countries.

From a methodological point of view, research networks have many advantages over single database studies:

  • Increase the size of study populations, which facilitates research on rare events, on medicines used in specialised settings, and on subgroup effects.
  • In the case of primary data collection, shorten the time needed to reach the desired sample size and so speed up the investigation of medicine safety issues or other outcomes.
  • Benefit from the heterogeneity of treatment options across countries, which allows the effects of different medicines used for the same indication, or specific patterns of utilisation, to be studied.
  • Provide additional knowledge on the generalisability of results and on the consistency of information, for instance whether a safety issue exists in several countries. Inconsistencies may reflect different biases or truly different treatment effects across the databases. Having experts from various countries address case definitions, terminologies, coding in databases and research practices may help to increase the consistency of results of observational studies.
  • Allow data or results to be pooled, increasing the amount of information gathered on a specific issue addressed in different databases.
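
As an illustration of how pooling results can improve precision, the following minimal sketch combines database-specific estimates by fixed-effect inverse-variance weighting. The site estimates and standard errors are invented for illustration, and the function name is hypothetical. A fixed-effect model assumes one common true effect; where treatment effects truly differ between databases, a random-effects model may be more appropriate.

```python
import math

def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (log effect estimate, standard error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Invented log hazard ratios and standard errors from three databases.
site_results = [
    (math.log(1.5), 0.20),
    (math.log(1.2), 0.10),
    (math.log(1.4), 0.25),
]

pooled_log_hr, pooled_se = pool_fixed_effect(site_results)
pooled_hr = math.exp(pooled_log_hr)
print(f"pooled HR = {pooled_hr:.2f}, SE(log HR) = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single database, which is the gain in precision referred to above.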

Different models can be applied for combining data or results from multiple databases. A common characteristic of all models is that participating partners maintain physical and operational control over electronic data and therefore the data extraction is always done locally. However, differences exist in the following areas: use of a common protocol; use of a common data model (CDM); and where and how the data analysis is done.
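
The shared principle (local extraction, with only the agreed output leaving each site) can be sketched as follows. This is a simplified illustration, not the workflow of any particular network: the record layout, the `SiteSummary` type and the function names are assumptions, and the data are invented.

```python
from dataclasses import dataclass

@dataclass
class SiteSummary:
    """Aggregate output that a data partner agrees to share."""
    site: str
    events: int
    person_years: float

def local_extract(site, records):
    """Runs at the data partner: reduce patient-level rows to aggregate counts."""
    events = sum(1 for r in records if r["event"])
    person_years = sum(r["follow_up_years"] for r in records)
    return SiteSummary(site, events, person_years)

def pooled_incidence_rate(summaries):
    """Runs at the coordinating centre: combine aggregates only."""
    total_events = sum(s.events for s in summaries)
    total_person_years = sum(s.person_years for s in summaries)
    return total_events / total_person_years

# Invented patient-level data that never leave their respective sites.
site_a = [
    {"event": True, "follow_up_years": 2.0},
    {"event": False, "follow_up_years": 3.0},
]
site_b = [
    {"event": False, "follow_up_years": 1.0},
    {"event": True, "follow_up_years": 4.0},
    {"event": False, "follow_up_years": 5.0},
]

summaries = [local_extract("A", site_a), local_extract("B", site_b)]
rate = pooled_incidence_rate(summaries)
print(f"pooled incidence rate: {rate:.3f} events per person-year")
```

Here the analysis is split between the sites and a coordinating centre; other models run the full analysis locally and share only final estimates.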

Use of a CDM implies that local formats are transformed into a predefined, common data structure, which allows similar data extraction and analysis across several databases. Sometimes the CDM also imposes a common terminology, as in the case of the Observational Medical Outcomes Partnership (OMOP) CDM. A CDM can be applied systematically to the entire database (generalised CDM) or only to the subset of data needed for a specific study (study-specific CDM). Initially used in the US, the OMOP CDM has since been applied in Europe, specifically as part of the Innovative Medicines Initiative (IMI) European Medical Information Framework (EMIF) and the European Health Data and Evidence Network (EHDEN)[1]. In the EU, study-specific CDMs have generated results in several projects, but experience based on real-life studies is still limited.
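
The transformation step can be sketched as below: two local formats are mapped into one common structure with a shared drug code. The field names, the local codes and the `COMMON:` vocabulary are hypothetical and do not reproduce the OMOP CDM or any real terminology.

```python
from datetime import date

# Hypothetical local-to-common drug code mapping (not a real vocabulary).
LOCAL_TO_COMMON = {
    ("db_a", "MET500"): "COMMON:metformin",
    ("db_b", "0042"): "COMMON:metformin",
}

def to_common(source, row):
    """Transform one local prescription row into the common structure."""
    if source == "db_a":
        # Assumed layout: ISO dates, patient key 'patid'.
        person, local_code = row["patid"], row["drug"]
        start = date.fromisoformat(row["rx_date"])
    elif source == "db_b":
        # Assumed layout: DD/MM/YYYY dates, patient key 'subject'.
        person, local_code = row["subject"], row["product_id"]
        day, month, year = (int(part) for part in row["dispensed"].split("/"))
        start = date(year, month, day)
    else:
        raise ValueError(f"unknown source: {source}")
    return {
        "person_id": f"{source}:{person}",
        "drug_code": LOCAL_TO_COMMON[(source, local_code)],
        "start_date": start,
    }

row_a = to_common("db_a", {"patid": 17, "drug": "MET500", "rx_date": "2020-03-05"})
row_b = to_common("db_b", {"subject": "X9", "product_id": "0042", "dispensed": "05/03/2020"})
print(row_a)
print(row_b)
```

Once both sources share the same keys, codes and date type, a single extraction or analysis script can run unchanged against either database.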

The growing use of health database networks has driven the development of new analytical methods for processing large volumes of heterogeneous data (so-called Big Data), as done by Observational Health Data Sciences and Informatics (OHDSI)[2], which has developed methods and tools for building network infrastructures. Similarly, the IMI PROTECT initiative showed that analyses can be conducted across multiple databases by adopting common protocols rather than by analysing centralised data. In addition, EMA coordinates the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP), which comprises about 200 public institutions and contract research organisations (CROs) involved in pharmacoepidemiology and pharmacovigilance. The use of research networks in drug safety analyses is well established, and a significant body of practical experience exists. By contrast, no consensus exists on the use of such networks, or indeed of single sources of observational data, in estimating effectiveness. Among database networks used for pharmacovigilance, two reference initiatives have been implemented internationally: Sentinel (FDA)[3] and EU-ADR (EU)[4].

Other international networks have been established to increase the power of post-marketing studies on medicines and vaccines, including ARITMO, SAFEGUARD, ADVANCE, SOS and EUROmediCAT[5] in Europe, CNODES in Canada, and the Asian Pharmacoepidemiology Network (AsPEN) in Asia and Australia.



[1] EHDEN is a five-year European IMI project that builds on the work of EMIF by creating a federated network of Real World Data (RWD) sources. It aims to harmonise approximately 100 million electronic health records (EHRs) in the European Union using the OMOP CDM. It will focus mostly on population data sources and outcomes-based research for validation.