Real-world data from the health decision maker perspective. What are we talking about?

Healthcare decision-makers are increasingly developing policies that rely on “real-world” data to provide “evidence” supporting and monitoring changes in clinical practice or policy decisions. Many strategies may be evaluated in experimental circumstances, but these rarely reflect clinical practice. Due to the current focus on information and computer technology to provide safer and more efficient healthcare delivery, the amount of electronic medical records and other electronic healthcare data is increasing exponentially, and these real-world data can be used for evidence generation. This review describes why and how healthcare and policy decision making could benefit from real-world data, introduces methods to investigate real-world clinical practice, lists the potentialities of routinely collected real-world data, reviews their availability in the world, and outlines future challenges in this field.


Real-world data and real-world evidence
Healthcare decisions should preferably be evidence-based and/or evaluated, but in reality many decisions need to be taken on the basis of imperfect evidence and with uncertainty about their outcome [1]. Evidence about the real world can best be obtained from real-world practice. By "real-world data" we mean data that are not collected under experimental conditions but are generated in routine care. There is an enormous potential to improve cure and care if the information generated in clinical routine is exploited for evidence generation. Secondary use of these data has greatly contributed to the areas of drug safety and outcomes research. A distinction between "real-world data" and "real-world evidence" is important. As noted by the task force of the International Society for Pharmacoeconomics and Outcomes Research [2], "data" conjures the idea of simple factual information, whereas "evidence" connotes the organization of the information to inform a conclusion or judgment. Evidence is generated according to a research plan and interpreted accordingly, whereas data are only one component of the research plan. Evidence is shaped, while data are simply raw materials and alone are non-informative.
In accordance with this statement, the current paper first focuses on general issues concerning the organization of data to obtain "real-world evidence". Available "real-world data" are considered afterwards.

Evidence from real-world clinical practice
Health insurers, regulators, physicians, and patients need information on the comparative effectiveness and safety of drugs in routine care [3]. This means that real-life care, rather than experimental settings, should be studied. Although the need is evident, there is controversy around non-experimental (i.e. observational) studies, even more so if they study effectiveness [4]. Several controversies have arisen, for example over the effectiveness of hormone replacement therapy (HRT) as evidenced in the Nurses' Health Study, which was not supported in a subsequent trial; the effect of NSAIDs on Alzheimer's disease, which could not be proven in clinical trials; and the effects of statins on fractures, which could not be seen in clinical trials. Methodological issues concerning the design, confounding, conduct of the study and interpretation of results need special attention in an epidemiologic observational framework.

The basic observational design
We now introduce a basic design in an observational epidemiologic framework, the so-called population-based cohort design. A cohort is defined by subjects meeting a set of eligibility criteria and by entry and exit time points. Consider, as a hypothetical example, a cohort investigation of antidiabetic therapies. Entry into the cohort may be defined by calendar time (e.g., any time after January 1, 2004), by age (any age before the 40th birthday), by events (the first use of oral hypoglycaemic medication), and/or by disease status (the date of diagnosis of type 2 diabetes). Exit from the cohort may be defined by the first occurrence of a specific calendar time (e.g., December 31, 2010), age (exit at the 80th birthday), events (death; exit from the study; the first switch from oral hypoglycaemic therapy to insulin), and/or disease status (first occurrence of coronary heart disease).
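The entry and exit rules of the hypothetical antidiabetic cohort can be sketched in a few lines of Python. This is only an illustrative sketch: the `Subject` fields, the helper names and the choice of exit events (study end, 80th birthday, death, first switch to insulin) are assumptions made for the example and do not come from any real database schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STUDY_START = date(2004, 1, 1)    # calendar-time entry criterion from the example
STUDY_END = date(2010, 12, 31)    # calendar-time exit criterion from the example
MAX_AGE = 80                      # exit at the 80th birthday


@dataclass
class Subject:
    # Hypothetical record layout, for illustration only.
    birth_date: date
    first_oral_rx: Optional[date]         # first oral hypoglycaemic prescription
    insulin_switch: Optional[date] = None
    death: Optional[date] = None


def birthday(birth_date: date, age: int) -> date:
    """Return the date of the given birthday (29 February rolls to 1 March)."""
    try:
        return birth_date.replace(year=birth_date.year + age)
    except ValueError:
        return date(birth_date.year + age, 3, 1)


def cohort_entry(s: Subject) -> Optional[date]:
    """Entry is the first prescription, if it falls inside the study window."""
    rx = s.first_oral_rx
    if rx is None or rx < STUDY_START or rx > STUDY_END:
        return None                       # never treated, or not an incident user
    if rx >= birthday(s.birth_date, MAX_AGE):
        return None                       # already past the age limit at first use
    return rx


def cohort_exit(s: Subject) -> Optional[date]:
    """Exit is the earliest of study end, 80th birthday, death or insulin switch."""
    entry = cohort_entry(s)
    if entry is None:
        return None
    candidates = [STUDY_END, birthday(s.birth_date, MAX_AGE)]
    candidates += [d for d in (s.insulin_switch, s.death)
                   if d is not None and d >= entry]
    return min(candidates)


def follow_up_days(s: Subject) -> Optional[int]:
    """Length of follow-up in days, or None if the subject is not in the cohort."""
    entry, exit_ = cohort_entry(s), cohort_exit(s)
    if entry is None or exit_ is None:
        return None
    return (exit_ - entry).days
```

A subject whose first prescription precedes the study window is treated here as a prevalent user and excluded, which is one simple way of restricting the cohort to incident users.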
The cohort of incident users of oral hypoglycaemic drugs may be illustrated graphically as in Figure 1. In the graph, ten subjects who entered the cohort at the time of their first prescription of the considered drugs (e.g. incident users of oral hypoglycaemic agents) are ranked according to their cohort entry date. The restriction to new initiators of the study drugs (inception cohort) will mitigate those issues and will also ensure that patient characteristics are assessed before the start of the study drug and therefore cannot be a consequence of the drug, similar to the principle of randomized controlled trials (RCTs). The advantages of the so-called "new user design" are recognized and well described by Ray [5].
It is important to stress that the ten incident users shown in Figure 1 are, potentially, all the individuals belonging to the target population who started therapy during the observational period. This is a first peculiarity of observational studies with respect to RCTs. As mentioned above, RCTs often select patients from centres of clinical excellence and exclude patients who are more vulnerable to adverse effects of therapy, without, however, a defined target population from which incident users arise. This means that population-based cohort studies, such as that described in Figure 1, are virtually free from external selection bias (lack of generalizability) and, hence, should adequately describe real data generated from unselected target populations.
Cohort members of incident drug users are followed to record two categories of data. The first concerns drug exposure. Figure 1 depicts a strong heterogeneity of drug exposure in both type and duration (represented by the hatching and the base length of the rectangles, respectively). This is the second substantial peculiarity of observational studies with respect to RCTs. The latter, in fact, are based on the minimization of exposure heterogeneity. Conversely, one main characteristic of observational studies is that they aim to describe the heterogeneity of drug exposure observed in real-world clinical practice, including heterogeneity in compliance with treatment and deviations from guideline-based clinical recommendations, and to identify the components of heterogeneity that affect the onset of the outcome.
The second family of data recorded during follow-up is the outcome onset. The outcome may be the disease that would be avoided or postponed by the therapy (e.g. switching to insulin as a proxy of disease worsening, or macrovascular events avoided as an effect of a given treatment regimen) as well as events potentially linked with short- or long-term drug safety (e.g. gastrointestinal bleeding or cancer). This is the third substantial peculiarity of observational studies compared to RCTs. The latter, in fact, are often characterized by a sample size and duration that do not allow for investigating rare outcomes and long-term effects of exposure. Conversely, observational investigations usually cover large populations followed for several years from the start of exposure. Besides this reference design, other ways of observing a given population have been widely used for epidemiological purposes. Among these, the nested case-control design, a direct derivation of the cohort design, has received great attention owing to its higher computational efficiency compared to the cohort design [6]. A complete review of the observational designs proposed in the methodological and applied literature, however, lies outside the objective of this introductory report. The review by Suissa is a suitable introductory reading on modern approaches for designing and analysing observational studies [7].

The comparative effectiveness principle
There is a last substantial difference between observational studies and RCTs. As pointed out by Cochrane about 40 years ago [8], RCTs on the efficacy of drugs for their regulatory approval study the extent to which an intervention does more good than harm under ideal circumstances ("Can it work?"). For most conditions, however, physicians have a choice of two or more medications that can prevent, cure, avoid progression of, and reduce suffering from diseases. For physicians, it is therefore not a question of whether to prescribe a drug [9] but of which drug to choose among several alternatives [10]. In such situations, physicians need to understand the comparative effectiveness and safety of these alternatives. Effectiveness, in fact, assesses whether an intervention does more good than harm compared to an alternative intervention when provided under the usual circumstances of healthcare practice ("Does it work in practice?").
Hence, Comparative Effectiveness Research (CER) tries to address both the limited generalizability to routine care and the need for comparison with an active comparator group through secondary use of health care data, often from large healthcare utilization databases [11]. CER thus aims to produce actionable evidence regarding the effectiveness and safety of medical products and interventions as they are used outside of controlled research settings (Figure 2) [12].
It is important to be explicit about the definition of comparative effectiveness as it is applied in this issue of EBPH. Regarding the term comparative, this report will focus on the majority of circumstances in which comparisons can be made between two or more active treatments, and it excludes the "no treatment" or placebo option [2].

Potentiality of observational approach
Observational studies utilizing existing health care data are suitable for studying several aspects of effectiveness in routine clinical practice. Two items have been considered

Potentiality and availability of routinely collected real-world data
The assessment of the comparative benefits and risks of various treatment options through the analysis of real-life data is controversial. Nevertheless, policymakers, stakeholders, and providers are increasingly prone to use large electronic databases to answer a variety of questions, such as those reported above.

Defining healthcare database
It is important to be explicit about which definition of database is applied in this paper. With this term we refer to electronic systems that are designed and fed to collect and electronically store all data of interest (e.g. drug prescriptions, hospital admissions, ambulatory visits, deaths, and so forth) concerning well-defined dynamic populations (e.g. those residing in a given administrative area or receiving care from a network of general practitioners). We will therefore use the term database to mean, implicitly, population-based databases.

Types and sources of healthcare database
Databases collecting health information can be classified into two broad categories: those that collect information for administrative purposes, such as filing claims for payment (denoted as administrative or healthcare utilization (HCU) databases), and those that serve as the patient's medical record and are therefore a primary tool by which physicians track health information on their patients (denoted as medical record (MR) databases) [19]. A basic description of HCU and MR data, in comparison to conventional sources of health care research, is provided in Table 1 [20][21][22][23][24].
HCU databases were initially created for administering payments to providers in nationally funded public or private healthcare delivery systems [25]. Healthcare programs necessarily require a system of electronic databases for their management. For example, data on drug dispensations, hospital admissions and outpatient visits, provided respectively by pharmacies, hospitals and physician ambulatories accredited by the health organization, are recorded and stored because these healthcare providers must be reimbursed for the services they supply. Databases conceived in this way usually contain patient-level data on health services for many millions of beneficiaries over long periods of time, and can be electronically linked via a unique patient identifier. In this way, the care journey of each individual beneficiary of the health system may be tracked. In the USA, typical examples of HCU databases are the electronic healthcare records of large health insurance companies like United Health, of Health Maintenance Organizations like Kaiser Permanente, or of government-funded healthcare programs like Medicaid and Medicare [19].
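The linkage of separate HCU archives through a unique patient identifier can be illustrated with a small Python sketch. The three source lists, their field order and the identifiers are invented for the example; real HCU archives have their own layouts and coding systems, which are not reproduced here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical extracts from three archives: (patient_id, event_date, detail).
drug_dispensations = [
    ("P001", date(2009, 1, 10), "metformin"),
    ("P002", date(2009, 2, 3), "glibenclamide"),
    ("P001", date(2009, 9, 20), "insulin"),
]
hospital_admissions = [
    ("P001", date(2009, 6, 2), "acute myocardial infarction"),
]
outpatient_visits = [
    ("P002", date(2009, 3, 15), "diabetology visit"),
    ("P001", date(2009, 1, 5), "diabetology visit"),
]


def link_records(**sources):
    """Group the records of every source by patient and sort them by date."""
    journeys = defaultdict(list)
    for source_name, records in sources.items():
        for patient_id, event_date, detail in records:
            journeys[patient_id].append((event_date, source_name, detail))
    for events in journeys.values():
        events.sort()                     # chronological care journey
    return dict(journeys)


journeys = link_records(
    pharmacy=drug_dispensations,
    hospital=hospital_admissions,
    outpatient=outpatient_visits,
)

for event_date, source, detail in journeys["P001"]:
    print(event_date.isoformat(), source, detail)
```

The sketch assumes that the identifier is recorded consistently across archives; when it is not, probabilistic linkage methods would be needed, which are beyond this example.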
A major weakness of these HCU databases is the instability of the population due to job changes, employers' changes of health plans, and changes in coverage for specific employees and their family members. The opportunity for longitudinal analyses is hindered by the continual turnover of plan members. Results from studies that use these databases may not be generalisable when the data concern an atypical population. For example, Medicare covers the elderly, Medicaid covers indigent and other special patient groups, and the Veterans Administration database covers predominantly an older and possibly poorer male population. Employer-based databases, on the other hand, may represent patient populations of a relatively higher socioeconomic class. These factors limit the generalizability of study results obtained using such databases [20].

Several Member States of the European Community provide universal coverage for many health services, so that stable populations and generalisable findings may be easily obtained by linking data from the public healthcare delivery system (i.e. the National Health Service). Nevertheless, the use of HCU databases for research purposes, albeit with significant differences between States, is not as popular as in the USA, mainly because of legal and privacy issues. Rather, whole segments of the healthcare delivery system rely on MR databases from primary healthcare, e.g. in the UK, The Netherlands, and Italy [26]. The most prominent of these is the UK Clinical Practice Research Datalink (CPRD), a large physician-based computerized database of anonymous longitudinal patient records from hundreds of general practices that collect data on approximately three million patients, equivalent to approximately 5% of the UK population [24,27]. Similarly, the Integrated Primary Care Information (IPCI) system from the Netherlands is a research-oriented database involving 500 general practitioners and covering over 1 500 000 people [28]. Finally, among the structured electronic MR databases available in Italy, we recall here the Health Search and ULNet databases, fed by approximately 900 and 220 general practitioners (GPs) respectively [29]. In addition, the Italian MR database specifically oriented to the paediatric population, the so-called Pedianet database, is also available for particular applications [30]. These databases include information on demographics, medical diagnoses, prescriptions, referrals to hospitals, smoking status, body mass index, immunizations, blood pressure measures and laboratory findings.

Table 1 (layout not recoverable from the source). Surviving entries: [21]; The Framingham Heart Study [22]; Medicaid database [23]; General Practice Research Database [24]. (a) Upcoding: coding of diagnosis or services to maximize the reimbursement. Source: Gandy et al [20], partially modified.

Strengths and weaknesses
Both HCU and MR databases have four key advantages for performing comparative effectiveness and safety research [31]: (1) they are available at relatively low cost; (2) their representativeness of routine clinical care makes it possible to study real-world effects; (3) the large size of the covered population shortens the time necessary to identify a sufficient number of users of a newly marketed drug [32]; (4) patient non-response bias and recall bias are non-existent in studies based on such records, as all data are recorded prospectively and independently of patients' recall or agreement to participate in a research study [11].
A major advantage of HCU data is that they reflect real-world clinical practice even more closely for large and unselected populations. This is particularly true where healthcare is assured by a National Health System (NHS) covering practically all citizens. However, studies based on HCU data have been criticized for the incompleteness of patient information such as markers of clinical disease severity, lifestyle habits, and socio-economic status, among others. Indeed, because HCU databases are aimed at reimbursing providers of health services, there is no reason for them to collect information that does not influence the costs of the health service. In contrast, although MR data are richer in clinical and lifestyle information, they often suffer from the fact that any given practitioner provides only a piece of the care a patient receives, and specialist and hospital care are unlikely to be recorded in a common MR database. Data quality issues, as well as the selection of general practitioners who take particularly careful care of their patients, are other potential limitations of studies based on MR data.

From current experiences to future challenges
As said before, very large studies can be performed with real-world databases in a relatively quick and inexpensive way, their use being facilitated by the development of increasingly powerful computer technologies. However, their immense potential as a research resource has yet to be fully realized by administrators and database managers or information system specialists of individual health plans. With the experience gained through the use of these data and a careful understanding of the underlying healthcare system within which the data were generated, computerized databases provide a highly useful data source for pharmacoepidemiology and healthcare research. Future studies should therefore move towards a more intensive, complete and integrated use of these data.
A major concern in this field is the relatively scarce use of real-world data for decision making, above all in some countries such as Italy. Currently, large databases are routinely used for administering payments to healthcare providers and managing care organizations. We hope, however, to have provided enough reasons to support the view that the potentialities of these data go beyond simple healthcare accounting. From an academic point of view, the most natural way of stimulating an intensive and complete use of HCU databases is to increase the number of in-depth studies showing the potentialities (and weaknesses) of this approach. For example, with the aim of assessing the association between the use of oral bisphosphonates and their benefits (i.e. reduction of the risk of bone fractures) and suspected harms (i.e. increased risk of jaw osteonecrosis and gastrointestinal events) in real-world clinical practice, the Bisphosphonates Efficacy-Safety Tradeoff (BEST) study has recently been funded with a grant from the Italian Drug Regulatory Agency (AIFA) [33]. This study has been carried out by creating a network of HCU databases that as a whole contains records of almost 19 million Italian citizens from five Italian Regions and ten Local Health Authorities. Another ongoing study, also funded with a grant from AIFA, concerns the assessment of inappropriate prescribing and the evaluation of outcomes among elderly patients affected by cardiovascular disease and other chronic comorbidities, assembling HCU data from three Italian Regions.
Another major concern in this field is the relatively scarce integration of HCU and MR data. As reported above, the strengths and weaknesses of these data sources are in large part complementary. This suggests that, where feasible, multiple sources should be considered for investigating a given research question. For example, the strength of a drug-outcome association estimated from an HCU database may be biased by unmeasured confounders. However, if additional data sources can be identified (e.g. an MR database covering the same population and time-window as the HCU database), external adjustment of the drug-outcome association may be attempted [34]. Similarly, although errors in measuring exposure are an important source of bias, mainly when an HCU database is used (e.g. gross approximations of the drug dose taken by a given patient are usually used), external adjustment for measurement errors may be attempted, conditional on the availability of MR data measuring the relationship between the biased and the approximately true exposure level [35].
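As a rough illustration of what external adjustment for an unmeasured confounder can look like, the sketch below applies a simple textbook bias formula for a single binary confounder. It is not claimed to be the specific method of [34]. It assumes that the prevalence of the confounder among exposed and unexposed subjects is taken from an external source (for instance an MR database), that the confounder-outcome relative risk comes from prior knowledge, and that there is no effect modification. All function names and numeric values are hypothetical.

```python
def confounding_bias(p_exposed: float, p_unexposed: float, rr_conf_outcome: float) -> float:
    """Multiplicative bias due to one unmeasured binary confounder.

    p_exposed / p_unexposed: prevalence of the confounder among exposed and
    unexposed subjects (here assumed to come from an external data source).
    rr_conf_outcome: relative risk linking the confounder to the outcome.
    """
    for p in (p_exposed, p_unexposed):
        if not 0.0 <= p <= 1.0:
            raise ValueError("prevalences must lie between 0 and 1")
    if rr_conf_outcome <= 0:
        raise ValueError("relative risk must be positive")
    numerator = p_exposed * (rr_conf_outcome - 1.0) + 1.0
    denominator = p_unexposed * (rr_conf_outcome - 1.0) + 1.0
    return numerator / denominator


def externally_adjusted_rr(apparent_rr: float, p_exposed: float,
                           p_unexposed: float, rr_conf_outcome: float) -> float:
    """Divide the apparent relative risk by the estimated bias factor."""
    return apparent_rr / confounding_bias(p_exposed, p_unexposed, rr_conf_outcome)


# Hypothetical numbers: an apparent RR of 1.5 from an HCU database, with the
# confounder (e.g. smoking) present in 40% of exposed and 20% of unexposed
# subjects according to external data, and doubling the risk of the outcome.
bias = confounding_bias(0.4, 0.2, 2.0)
adjusted = externally_adjusted_rr(1.5, 0.4, 0.2, 2.0)
print(f"bias factor = {bias:.3f}, adjusted RR = {adjusted:.3f}")
```

When the confounder is equally common in the two exposure groups, the bias factor equals 1 and the apparent estimate is left unchanged, which is a quick sanity check on the formula.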
There is no obvious method for combining different electronic medical records into a uniform repository. In-depth work on this approach is ongoing in several studies funded by grants from the European Community. For example, the "Safety Of non-Steroidal anti-inflammatory drugs" (SOS) project (http://www.sos-nsaids-project.org/), aimed at comparing the risk of cardiovascular and gastrointestinal events in users of any type of traditional nonsteroidal anti-inflammatory drugs (tNSAIDs) or Coxib, has been designed, and is still ongoing, by creating an international network for conducting common-protocol database studies. In particular, both HCU and MR databases containing records of, in total, more than 40 million European citizens from four different countries (Italy, Germany, The Netherlands and UK) have been assembled. Similarly, a common-protocol, multi-source and multi-country database has recently been designed to address drug safety concerns about diabetes and antidiabetic therapies (the Safeguard Project; http://www.safeguard-diabetes.org/). Despite these promising projects, observational studies using real-world databases suffer from limited research funding opportunities. This, of course, generates the risk that academic groups become too dependent on industry funding, in terms of both study questions and credibility [36].
There is a great need for independent studies investigating the use, equity, effects, and costs of healthcare with robust methodologies. We therefore hope that national and regional governments, particularly those where consolidated welfare systems operate through the NHS, begin to fund projects in this field more often.
In the meantime, in an attempt to stimulate health care decision makers and public health researchers to direct resources towards real-world data, the current issue of EBPH is devoted to exploring in depth the methodological topics and privacy concerns of real-world data. Examples of MR and HCU databases will also be presented. Finally, the so-called CRACK program (Carry out a Repository for Administrative and Clinical data Knotting), which includes methodological, health-related, and educational projects to evaluate the management of chronic conditions in real-world clinical practice, will be described.

Figure 1. Illustration of a fixed cohort of ten incident users of a given drug therapy who were generated from a well-defined dynamic population from 2006 through 2007

Figure 2. Goal of comparative-effectiveness research (CER) in contrast to randomized controlled trials (RCTs)

Epidemiology Biostatistics and Public Health - 2013, Volume 10, Number 3, e8979-7