Abstract

Background
Various methods exist to estimate disease prevalences. The aim of this study was, first, to determine whether dispensed, self-reported and prescribed medication data could be used to estimate the prevalence of diabetes mellitus and thyroid disorders. Second, these pharmaco-epidemiological estimates were compared with prevalences based on self-reported diagnoses and doctor-registered diagnoses.

Methods
Data on medication for diabetes and thyroid disorders were obtained from three different sources in Flanders (Belgium) for 2008: a purely administrative database containing data on dispensed medication, the Belgian National Health Interview Survey for self-reported medication and diagnoses, and a patient record database for prescribed medication and doctor-registered diagnoses. Prevalences were estimated based on medication data and compared with each other. Cross-tabulations of dispensed medication and self-reported diagnoses, and of prescribed medication and doctor-registered diagnoses, were investigated.

Results
Prevalences based on dispensed medication were the highest (4.39 and 2.98% for diabetes and thyroid disorders, respectively). The lowest prevalences were found using prescribed medication (2.39 and 1.72%, respectively). Cross-tabulating dispensed medication and self-reported diagnoses yielded a moderate to high sensitivity for diabetes (90.4%) and thyroid disorders (77.5%), while prescribed medication showed a low sensitivity for doctor-registered diagnoses (56.5 and 43.6%, respectively). The specificity remained above 99% in all cases.

Conclusions
This study was the first to perform cross-tabulations for disease prevalence estimates between different databases and within (sub)populations. A purely administrative database was shown to be a reliable source to estimate disease prevalence based on dispensed medication. Prevalence estimates based on prescribed or self-reported medication were shown to have important limitations.
Introduction

Epidemiology is an important part of public health practice, and estimations of disease prevalence are essential to determine the need for care and related costs.1 Disease prevalence can be estimated by direct methods such as health surveys2 or by indirect methods such as health administrative databases. The latter are either patient record databases (PRDs), which facilitate retrospective analyses of health parameters, diagnoses and prescribed drugs, or purely administrative databases (PADs), which include detailed information on reimbursed medical procedures and dispensed medication.3,4 Health surveys provide a relatively inexpensive method of gathering prevalence information, but they have often been criticized for potential bias due to self-reporting and sampling errors, such as sample selection and non-response. PRDs also have limitations, e.g. their dependence on the coding behaviour of clinicians, which can yield heterogeneous results.1 Finally, PADs that infer the disease status from dispensed medication are limited to conditions exclusively treated with specific medication that could be used in any stage of the disease. Furthermore, very little is known about the validity of using dispensed medication data to estimate disease prevalence. A number of studies have found that dispensed and prescribed medication data provide higher estimates for the prevalence of certain chronic conditions when compared with prevalences from registered and self-reported diagnoses.1,5 These studies make a case for using medication data as a meaningful way to estimate the prevalence of certain diseases, provided that the databases used are exhaustively and rigorously completed.
Various studies have used direct and indirect methods for prevalence estimation, but few offer an insight into the validity of these different methods by comparing medication-based estimates to prevalences based on registered or self-reported diagnoses.6,7 Diabetes mellitus and thyroid disorders are chronic conditions that are relevant for general practitioners (GPs), as they often require close follow-up and management. Furthermore, these conditions fulfill the following assumptions: patients are mainly treated pharmacologically; the pharmacological treatment is specific for these conditions; and the treatment requires regular and continuous intake. Therefore, this study was designed, first, to determine whether self-reported, dispensed and prescribed medication data obtained from a national health interview survey (HIS), PAD and PRD, respectively, could be used to estimate the prevalence of diabetes and thyroid disorders. Second, these pharmaco-epidemiological estimates were compared with prevalences based on self-reported data from HIS and diagnosis-based prevalences from PRD.

Methods

Study design and population
This study is a descriptive cross-sectional study and was approved by the Medical Ethical Committee of the University of Leuven, Belgium (mp11398). The study population included patients residing in Flanders, the Northern region of Belgium, during the period of 1 January 2008 to 31 December 2008. Flanders had a population of ∼6 161 600 at the beginning of 2008.8 Belgium provides its citizens with universal health care.

Data sources

PAD: Inter-Mutualistic Agency; Pharmanet
Health insurance in Belgium is mandatory and covers 99% of the population.
The Inter-Mutualistic Agency (IMA) is a joint venture of the seven national sickness funds and collects and manages all data on healthcare expenditures as well as Pharmanet data.9 Pharmanet logs all data on reimbursed dispensed medication from public pharmacies in Belgium.10,11 Drug information from Pharmanet is classified according to the WHO’s Anatomical Therapeutic Chemical (ATC) classification system. The ‘Echantillon Permanent/Permanente Steekproef’ (EPS, Permanent Sample) is an anonymous, representative randomized sample of 1/40th of the Belgian population from the IMA database.10 In this study, only Flemish data (n = 154 688) were included.

Health survey: Belgian National HIS
The second data source used in this study is the Belgian National HIS 2008, which was conducted between May 2008 and July 2009 in a sample of Belgian residents (n = 11 253).12 A detailed description of the design and sampling methods can be found elsewhere.13 Through the use of post-stratification weights, results are representative for the total Belgian population.14 The survey was carried out by Statistics Belgium15 and exempted by law from requiring ethics approval. For the purpose of this study, only the Flemish population within HIS (n = 3896) was taken into account. In 2012, authorization was obtained from the Belgian Privacy Commission (no SCSZG/12/012) to link data from HIS of 2008 with IMA data, by making use of a unique identifier for each person (social security number). For 122 respondents (3.1%) the social security number could not be retrieved, leading to a linked database of 3774 HIS participants. Among those, 62 could not be identified in the IMA database. It was assumed that they were not insured and did not receive medication. The denominator was 3743 in the diabetes group and 3757 in the thyroid disorders group.
These differences can be explained by non-response to specific questions regarding diabetic disease or thyroid disorders by 31 and 17 patients, respectively. The linked database is referred to as the ‘HIS-PAD database’.

PRD: Intego database
In addition, data were obtained from Intego, a Flemish general practice-based morbidity registration network at the Department of General Practice of the University of Leuven, Belgium.3,16 Intego procedures were approved by the ethical review board of the Medical School of the University of Leuven (N° ML 1723) and by the Belgian Privacy Commission (no SCSZG/13/079). In 2008, 86 GPs, all using the medical software program Medidoc (Corilus NV, Aalter, Belgium), collaborated in the Intego project. They worked in 55 practices evenly spread throughout Flanders, Belgium. GPs applied for inclusion in the registry. Before acceptance of their data, registration performance was audited using a number of algorithms that compared their results with those of all other applicants. Only the data of the practices with good registration performance were included in the database. The selection procedure was described in detail previously.3 Using specially framed extraction software, data were encrypted and collected from GPs’ personal computers and entered into a central database. Registered data were continuously updated and historically accumulated for each patient. Diagnoses were classified according to a very detailed thesaurus automatically linked to the International Classification of Primary Care, 2nd Edition (ICPC-2) and the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10). Drugs were classified according to the ATC classification system.
The estimated practice population (PP) of 2008 (n = 153 092) was used as the denominator for the analyses.16

Identification of diseases and medication use
In the PRD, the ICPC-2 codes used for diabetes mellitus were ‘diabetes insulin dependent’ (T89) and ‘diabetes non-insulin dependent’ (T90). The codes used for thyroid disorders were ‘malignant neoplasm thyroid’ (T71), ‘benign neoplasm thyroid’ (T72), ‘goitre’ (T81), ‘hyperthyroidism/thyrotoxicosis’ (T85) and ‘hypothyroidism/myxoedema’ (T86). No distinction was made between types of diabetes mellitus or between the different thyroid disorders. Prevalence information from HIS was collected on a list of chronic diseases, including diabetes and thyroid disorders. Respondents were asked whether they had suffered from the disease in the past 12 months. The wording of the disease was based on guidelines from Eurostat in the framework of the European Health Interview Survey.17 The ATC code used for antidiabetic medication was A10 [including insulins and analogues (A10A), blood glucose lowering drugs, excluding insulin (A10B) and other drugs used in diabetes (A10X)]. The ATC code used for thyroid therapy was H03 [including thyroid preparations (H03A), antithyroid preparations (H03B) and iodine therapy (H03X)]. In the HIS database, two methods were used to estimate medication use: self-reported medication use in the 2 weeks before the interview, based on disease categories, and self-reported medication use in the past 24 h, based on brand names, which were subsequently recoded into ATC codes. Self-reported thyroid medication use in the 2 weeks before the interview was not available.

Statistical analysis
For the first part of the analysis, the disease prevalence rates based on medication data, clinical diagnosis and self-report were compared. PRD disease prevalences were based on the ICPC-2 codes calculated on 31 December 2008. Disease prevalence based on medication was defined as at least one prescription of the aforementioned drugs in 2008.
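The medication-based case definition described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code (the study reports using R, SAS and Stata). The record layout, the function names `medication_cases` and `prevalence`, and the toy records are assumptions made for this example; only the ATC groups (A10, H03), the 2008 window and the idea of a minimum number of prescriptions come from the text.

```python
from collections import Counter
from datetime import date

# ATC prefixes named in the text; everything else here is illustrative.
ATC_PREFIX = {"diabetes": "A10", "thyroid": "H03"}


def medication_cases(records, condition, min_count=1, year=2008):
    """Return the set of patient ids with at least `min_count` records
    for the condition's ATC group in the given calendar year.

    `records` is an iterable of (patient_id, atc_code, date) tuples,
    a hypothetical layout chosen for this sketch.
    """
    prefix = ATC_PREFIX[condition]
    counts = Counter(
        pid
        for pid, atc, when in records
        if atc.startswith(prefix) and when.year == year
    )
    return {pid for pid, n in counts.items() if n >= min_count}


def prevalence(cases, denominator):
    """Proportion of the denominator population classified as cases."""
    return len(cases) / denominator


# Toy records, invented for illustration only.
records = [
    (1, "A10BA02", date(2008, 2, 1)),    # metformin
    (1, "A10BA02", date(2008, 6, 1)),
    (2, "H03AA01", date(2008, 3, 15)),   # levothyroxine
    (3, "A10AB01", date(2007, 12, 30)),  # insulin, outside the study year
    (4, "C07AB02", date(2008, 5, 5)),    # unrelated drug class
]

diabetes_1 = medication_cases(records, "diabetes", min_count=1)
diabetes_2 = medication_cases(records, "diabetes", min_count=2)
thyroid_1 = medication_cases(records, "thyroid", min_count=1)
```

Raising `min_count` to 2 or 3 gives the stricter definitions used in the sensitivity analyses; by construction the case set can only shrink as the threshold rises.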
PAD prevalences were based on the total number of people with dispensed diabetes or thyroid medication in a public pharmacy in 2008 among all insured people. HIS estimates were representative of the 2008 Flemish population through the use of post-stratification weights, with the national register as auxiliary database. Sensitivity analyses consisted of repeated analyses for patients who received at least two or three prescriptions. Finally, our results were cross-tabulated within PRD (clinical diagnoses vs. prescribed drugs) and within the HIS-PAD database (self-reported diagnoses vs. dispensed drugs). Using the GPs’ coded or the patients’ self-reported diagnoses as the reference standard, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) (and the 95% CIs) of the disease prevalence based on prescribed and dispensed medication, respectively, were calculated. This was repeated for 2013 in the PRD and a sensitivity analysis with only hypothyroidism (T86) was performed. A P value < 0.05 was considered statistically significant. All statistical analyses were performed using R Software Version 3.0.3 (Free Software Foundation Inc., Boston, MA, USA), SAS 9.3 (SAS Institute, Cary NC, USA) and Stata 14.0 (StataCorp LP, College Station, TX, USA).

Results

Comparison of prevalence rates based on different medication databases
The mean age was 41.4 years in the PAD, 40.9 years in the HIS database and 40 years in the PRD, and the proportion of women was 51, 51 and 50%, respectively (table 1).
Table 1 Prevalence of diabetes and thyroid disorders according to different methods and data sources in 2008

| Database | n | Diabetes % | Diabetes 95% CI (%) | Thyroid disorders % | Thyroid disorders 95% CI (%) |
|---|---|---|---|---|---|
| Medication-based | | | | | |
| PAD (≥1)^a | 154 688 | 4.39 | 4.29–4.49 | 2.98 | 2.89–3.06 |
| PAD (≥2)^a | 154 688 | 4.07 | 3.97–4.17 | 2.62 | 2.54–2.70 |
| PAD (≥3)^a | 154 688 | 3.84 | 3.75–3.94 | 2.20 | 2.13–2.28 |
| HIS (2 weeks)^b | 3896 | 3.57 | 2.87–4.27 | Not available | Not available |
| HIS (24 h)^c | 3896 | 2.78 | 2.18–3.38 | 2.45 | 1.87–3.03 |
| PRD (≥1)^d | 153 092 | 2.39 | 2.32–2.47 | 1.72 | 1.66–1.79 |
| PRD (≥2)^d | 153 092 | 2.06 | 1.99–2.13 | 1.39 | 1.33–1.45 |
| PRD (≥3)^d | 153 092 | 1.73 | 1.67–1.80 | 1.04 | 0.99–1.09 |
| Diagnosis-based | | | | | |
| HIS Prev^e | 3896 | 3.28 | 2.64–3.92 | 2.95 | 2.30–3.61 |
| PRD Prev^f | 153 092 | 3.61 | 3.52–3.71 | 2.15 | 2.07–2.22 |

a Minimal number of delivered prescriptions at pharmacy in the year 2008.
b Self-reported medication used in the 2 weeks prior to completion of the HIS-survey.
c Medication used in the 24 h prior to completion of the HIS-survey (recorded by interviewer).
d Minimal number of prescriptions issued in the year 2008.
e Prevalence based on self-reported diagnoses.
f Prevalence based on coded diagnoses by GPs.
CI, confidence interval; PAD, purely administrative database; HIS, health interview survey; PRD, patient record database.
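As a rough check on the intervals in Table 1, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The article does not state which interval method was used, so the choice of the Wald formula and the function name `wald_ci` are assumptions. The approach only suits the unweighted PAD and PRD rows; the HIS intervals depend on survey weights and design and cannot be reproduced this way.

```python
import math


def wald_ci(p, n, z=1.959964):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p is the observed proportion, n the denominator, z the standard
    normal quantile (default: two-sided 95%).
    """
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


# PAD, diabetes, >=1 dispensed prescription (values taken from Table 1).
low, high = wald_ci(0.0439, 154688)
print(f"{100 * low:.2f}-{100 * high:.2f}%")  # prints 4.29-4.49%
```

Because the published percentages are rounded to two decimals, recomputing other rows from the table can differ from the printed bounds in the last digit.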
The prevalence of diabetes was highest based on dispensed medications (PAD) (4.39%, 95% CI 4.29–4.49%) and lowest based on prescribed medications (PRD) (2.39%, 95% CI 2.32–2.47%). In the HIS database, the prevalence of diabetes was significantly different between the two methods used [2 weeks vs. 24 h, 3.57 vs. 2.78% (P = 0.041), respectively]. A similar pattern was found for the prevalence of thyroid disorders, where the highest prevalence was based on dispensed medications (PAD) (2.98%, 95% CI 2.89–3.06%) and the lowest on prescribed medications (PRD) (1.72%, 95% CI 1.66–1.79%). The sensitivity analyses based on ≥2 and ≥3 prescriptions showed an overall decline in prevalence for both conditions in all databases (table 1).

PRD: sensitivity and specificity of prescribed medication
The sensitivity and specificity of prescribed medication for the doctor-recorded prevalence of diabetes were 56.5 and 99.6%, respectively. For thyroid disorders, the sensitivity and specificity of prescribed medication were 43.6 and 99.2%, respectively (table 2).
Table 2 Sensitivity and specificity of prescribed medications (PRD) for doctor-recorded diagnoses in 2008

| Database | n | Prevalence (%) | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) |
|---|---|---|---|---|---|---|
| Diabetes | | | | | | |
| PRD ≥1pr | 153 092 | 2.39 | 56.5 (55.2–57.9) | 99.6 (99.6–99.7) | 85.4 (84.2–86.5) | 98.4 (98.3–98.5) |
| PRD ≥2pr | 153 092 | 2.07 | 49.8 (48.5–51.1) | 99.7 (99.7–99.8) | 87.2 (86.0–88.3) | 98.2 (98.1–98.2) |
| PRD ≥3pr | 153 092 | 1.73 | 42.2 (40.9–43.5) | 99.8 (99.8–99.8) | 88.3 (87.1–89.5) | 97.9 (97.8–97.9) |
| Thyroid disorders | | | | | | |
| PRD ≥1pr | 153 092 | 1.72 | 43.6 (41.9–45.4) | 99.2 (99.2–99.2) | 54.4 (52.5–56.3) | 98.8 (98.7–98.8) |
| PRD ≥2pr | 153 092 | 1.39 | 36.2 (34.6–37.9) | 99.4 (99.3–99.4) | 55.7 (53.6–57.9) | 98.6 (98.6–98.7) |
| PRD ≥3pr | 153 092 | 1.04 | 27.5 (25.9–29.0) | 99.5 (99.5–99.6) | 56.5 (54.0–58.9) | 98.4 (98.4–98.5) |

PRD, patient record database; pr, prescription(s); PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval.

For both diabetes and thyroid disease the sensitivity of prescribed medication was below 60% and decreased further with an increasing number of prescriptions, while specificity remained above 99%. NPV of prescribed medication was ≥98% for both diseases in all scenarios. PPV was high for prescribed diabetes medication and ranged between 85 and 88%.
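The relation between these measures can be made concrete with a short Python sketch. It derives PPV and NPV from sensitivity, specificity and the prevalence of the reference diagnosis using Bayes' rule. The function name `predictive_values` is an assumption for this example, and the article itself computed the measures from cross-tabulated counts, which are not reported; the inputs below are the rounded figures from Tables 1 and 2.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV and NPV from sensitivity, specificity and the
    prevalence of the reference diagnosis (all as proportions)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Thyroid disorders, PRD, >=1 prescription: sensitivity and specificity
# from Table 2, diagnosis-based prevalence (2.15%) from Table 1.
ppv, npv = predictive_values(0.436, 0.992, 0.0215)
print(f"PPV {100 * ppv:.1f}%, NPV {100 * npv:.1f}%")
```

With these rounded inputs the sketch gives a PPV of roughly 54.5% and an NPV of roughly 98.8%, close to the published 54.4 and 98.8%. It also shows why PPV is so much lower for thyroid disorders than for diabetes: at a reference prevalence near 2%, even a 0.8% false-positive rate produces almost as many false positives as true positives.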
For prescribed thyroid medication, PPV was lower and ranged between 54 and 57%. This exercise was repeated for 2013 (data not shown). The prevalence of diabetes and thyroid disorders based on a recorded diagnosis rose to 5.34% (95% CI 5.22–5.46) and 2.45% (95% CI 2.37–2.53), respectively. The sensitivity of prescribed medication ranged between 36 and 46% and between 26 and 46% for diabetes and thyroid disorders, respectively, according to the number of prescriptions taken into account. Specificity remained ≥99% in all scenarios. The prevalence of hypothyroidism (T86) was 0.93% (0.88–0.98) in 2008 (data not shown). Sensitivity was 66, 55 and 42% for ≥1, ≥2 and ≥3 prescriptions, respectively. Specificity was ≥98% in all scenarios.

HIS-PAD database: sensitivity and specificity of dispensed medication
The sensitivity and specificity of dispensed medication for a self-reported diagnosis of diabetes were high, 90.4 and 98.9%, respectively. The sensitivity and specificity of dispensed medication for self-reported thyroid disorders were 77.5 and 98.5%, respectively. Notable for thyroid disorders was the sensitivity drop from 70.5 to 48.3% for ≥2 dispensed prescriptions and ≥3 dispensed prescriptions, respectively. The NPV of dispensed medication was ≥98% for both diseases in all scenarios. The PPV for dispensed diabetes medication ranged between 73 and 81%. For dispensed thyroid medication the PPV was lower and ranged between 61 and 66% (table 3).
Table 3 Sensitivity and specificity of dispensed medications for self-reported diagnoses (HIS-PAD database) in 2008

| Database | n | Prevalence (%) | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) |
|---|---|---|---|---|---|---|
| Diabetes | | | | | | |
| HIS-PAD ≥1med | 3743 | 4.18 | 90.4 (84.9–95.4) | 98.9 (98.5–99.2) | 73.2 (66.0–80.5) | 99.7 (99.5–99.9) |
| HIS-PAD ≥2med | 3743 | 3.88 | 89.1 (83.7–94.5) | 99.2 (98.9–99.5) | 78.3 (71.2–85.5) | 99.6 (99.4–99.8) |
| HIS-PAD ≥3med | 3743 | 3.77 | 86.6 (80.7–92.5) | 99.3 (99.0–99.6) | 80.5 (73.3–87.7) | 99.5 (99.3–99.8) |
| Thyroid disorders | | | | | | |
| HIS-PAD ≥1med | 3757 | 3.72 | 77.5 (67.0–84.0) | 98.5 (98.0–99.0) | 60.5 (50.6–70.3) | 99.2 (99.0–99.5) |
| HIS-PAD ≥2med | 3757 | 3.19 | 70.5 (61.4–79.6) | 98.9 (98.5–99.3) | 65.6 (55.6–75.7) | 99.1 (98.8–99.4) |
| HIS-PAD ≥3med | 3757 | 3.08 | 48.3 (37.1–59.5) | 99.1 (98.7–99.5) | 62.3 (50.2–74.4) | 98.4 (98.0–98.9) |

HIS, Health Interview Survey; PAD, purely administrative database; med, dispensed medication(s); PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval.

Discussion

Dispensed medications (PAD) showed the highest prevalence for both diabetes and thyroid disorders, approximating the prevalence rates reported in other epidemiological studies.18–23 The lowest prevalences were based on prescribed medications (PRD) and self-reported medication (HIS).
The sensitivity of prescribed medication for clinical diagnoses (PRD) was rather low. Conversely, the HIS-PAD database showed that dispensed medication had a higher sensitivity for self-reported diabetes and thyroid disorders. The specificities of dispensed medication (HIS-PAD database) and of prescribed medication (PRD) were high for both conditions. Recently, the NCD Risk Factor Collaboration performed a pooled analysis of 751 population-based studies and found a mean prevalence of diabetes of 5.9% for women and 7.9% for men in Northwestern Europe.18 According to WHO estimates, the prevalence of diabetes for the European region was 7.3% in 2014.19 Large epidemiological studies have shown a wide variation in the prevalence of thyroid disorders.20–22 A meta-analysis of nine studies found a mean prevalence of 3.82% for thyroid disorders in Europe.23 This study showed lower prevalences for both conditions in Flanders in 2008. However, the prevalence of these conditions increased substantially in the subsequent years, as shown by the 2013 prevalence estimates from the PRD. Furthermore, the ‘most realistic’ prevalence of diabetes in PRD is often estimated based on the combination of a registered diagnosis, laboratory values (haemoglobin A1C > 6.5%) and prescribed medication, in order to overcome the limitations of registries.24 In Portugal, one study found a slightly lower prevalence of diabetes using the pharmaco-epidemiological approach (2.52%) compared with the prevalence reported by the Portuguese National Health Authorities (3–5%) based on a national health survey.6 In a similar study in Cadiz, Spain, the prevalence of hypothyroidism was estimated based on the defined daily dose (DDD), the prescribed daily dose and patient records linked with the Andalusian Health Service. 
The authors concluded that patient records were a more valuable tool to estimate the prevalence of hypothyroidism than DDD.7 The 45 and Up study linked self-reported diabetes information with administrative data such as pharmaceutical claims for a sample of adults aged 45 and over in New South Wales. In the linked data, 71.4% of the participants with self-reported diabetes had a record of at least one pharmaceutical claim for diabetes-related medication, which is lower than the sensitivity based on the HIS-PAD database in this study.25 EPS (PAD data) is considered to be the most representative database for the Flemish population regarding medication, because it includes medication prescribed by both GPs and specialists. This might partly explain why the prevalence based on dispensed medications (PAD) was higher than the prevalence based on prescribed medications (PRD). PRD only registers prescriptions from GPs, so patients directly followed-up by specialists are not included. In addition, PRD only registers electronic prescriptions, which could be another reason for the lower prevalence estimates. To date, medication can be prescribed both by computer and by hand. It is conceivable that in future only electronic prescriptions will be allowed and this will probably increase the sensitivity of prescribed medication. Furthermore, PAD prevalences were higher than HIS prevalences because PAD included prescriptions for the whole year, while HIS only considered self-reported medication intake in the 2 weeks and 24 h before the interview. Compliance to medication is low, especially for chronic conditions,26 thus patients might not report the prescribed medication. The sensitivity of prescribed medication to estimate prevalence rates was rather low. Possible reasons are mentioned above such as follow-up by specialists or handwritten prescriptions, e.g. for patients visited at home or in nursing homes. 
This hypothesis was confirmed by the sensitivity finding of prescribed medication for diabetes in patients aged 60–79 years. Sensitivity increased to 65% and was mainly low (<50%) in young people and people aged 80 and over (data not shown). Additionally, some patients with diabetes could be treated with dietary measures and lifestyle changes only, leading to false negatives in the PRD and HIS-PAD database.23 A possible reason for the low sensitivity of prescribed thyroid medication is the presence of subclinical diagnoses, which usually do not need any treatment, but may still be registered as thyroid disorders. Approximately 6% of the general population has subclinical hypothyroidism and 1% subclinical hyperthyroidism.27 Another reason for the low sensitivity is the term ‘thyroid disorders’, which also includes goiters without thyroid dysfunction or thyroid tumors, neither of which need medication that is included in the ATC code H03. The sensitivity of prescribed thyroid medication increased by >20% when only the hypothyroidism (T86) diagnosis was considered. False negatives in the HIS-PAD database probably also included data from patients who did not retrieve their prescriptions from the pharmacy. False positives of prescribed or dispensed medication could partly be explained by taking into account that some medication is also used in other contexts, e.g. metformin for polycystic ovarian syndrome28 or levothyroxine and metformin for weight loss.29–31 However, since the proportion of false positives was smaller than that of false negatives, the actual prevalence of diabetes and thyroid disorders was probably higher than that of our calculations.11 The HIS-PAD database showed a high sensitivity and specificity for both conditions, leading to the conclusion that PAD could provide a good estimate for self-reported disease prevalence rates for diseases requiring disease-specific medication. 
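The argument above, that false negatives outweigh false positives so that medication-based figures understate prevalence, can be illustrated with a small worked calculation. The sketch treats the doctor-registered diagnosis as the reference, as the article does, and uses the rounded PRD figures for diabetes (sensitivity 56.5%, specificity 99.6%, diagnosis-based prevalence 3.61%). The function name `apparent_prevalence` is an assumption for this example and the calculation is not part of the original analysis.

```python
def apparent_prevalence(sensitivity, specificity, reference_prevalence):
    """Share of the population flagged by medication, split into
    true positives and false positives, plus the share missed."""
    true_pos = sensitivity * reference_prevalence
    false_pos = (1.0 - specificity) * (1.0 - reference_prevalence)
    false_neg = (1.0 - sensitivity) * reference_prevalence
    return true_pos + false_pos, false_pos, false_neg


# Diabetes, PRD, >=1 prescription (rounded values from Tables 1 and 2).
flagged, false_pos, false_neg = apparent_prevalence(0.565, 0.996, 0.0361)
print(f"flagged by medication: {100 * flagged:.2f}% of the population")
print(f"false positives:       {100 * false_pos:.2f}%")
print(f"false negatives:       {100 * false_neg:.2f}%")
```

With these inputs about 2.4% of the population is flagged by medication, in line with the 2.39% reported for prescribed medication, while the missed cases (about 1.6% of the population) are roughly four times as numerous as the false positives (about 0.4%).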
However, although studies have shown that HIS results are adequate concerning self-reported diabetes diagnosis, they are still less reliable compared with doctor-registered diagnoses.32 Moreover, studies have shown that self-reported data such as HIS data have a lower validity than doctor-registered data.33,34 In the future, the low sensitivity of a prevalence estimate based on prescribed medications (PRD) could be increased by linking the PRD with PAD. A PRD–PAD database could provide more complete prevalence estimates based on medication data, and an ideal database to validate these prevalence estimates. Furthermore, the current study underlines the importance of continuous data quality feedback in a large data registry (PRD). Periodically offering the registrators feedback on their coding behaviour will improve the quality of their registrations.35 To our knowledge, this study was the first to perform cross-tabulations between different databases and within (sub)populations. Both the estimates of disease prevalence and the results of the cross-tabulations were taken into account to determine the validity of the data sources. The strength of this study lies in the use of three different databases, representative of the total Flemish population. The use of large sample sizes and post-stratification weights reduced the risk of inclusion bias. However, a few limitations should be noted. First, the absence of a gold standard for the estimation of the disease prevalence is a restriction in this study. Second, no data were available about the percentage of medication prescribed by specialists, prohibiting correct interpretation of the medication prescribed by GPs. Third, because not all databases were linked, it was impossible to cross-tabulate all the data. To conclude, disease prevalence estimates based on dispensed medications (PAD) were shown to be higher than disease estimates based on prescribed medications (PRD) and self-reported medication use (HIS). 
Furthermore, the sensitivity and specificity of dispensed medication for self-reported diagnoses were shown to be high. This study revealed important limitations in using prescribed medication and self-reported medication to estimate disease prevalences. Moreover, the importance of a future linkage between PAD and PRD was emphasized.

Funding
The organization of the Belgian National Health Interview Survey was funded by the federal, regional and community health authorities in Belgium. The linkage of the HIS 2008 with the health insurance data was funded by the National Institute for Health and Disability Insurance Belgium. Intego is funded on a regular basis by the Flemish Government (Ministry of Health and Welfare). This work would not have been possible without the collaboration of all general practitioners of the Intego network. We hereby state the independence of the researchers from the funders.

Conflicts of interest: None declared.

Key points
This study was the first to perform cross-tabulations for disease prevalence estimates between different databases and within (sub)populations.
Dispensed medication data from a purely administrative database were shown to be a reliable source to estimate self-reported disease prevalence for diseases requiring disease-specific medication.
Prevalence estimates based on prescribed or self-reported medication were shown to have important limitations.
This study emphasizes the importance of a future linkage between the national health insurance database and a comprehensive patient record database in Belgium.

References
1 Chini F, Pezzotti P, Orzella L, et al. Can we use the pharmacy data to estimate the prevalence of chronic conditions? A comparison of multiple data sources. BMC Public Health 2011; 11: 688.
2 Health Interview Survey, Scientific Institute of Public Health, Belgium. Available at: https://his.wiv-isp.be/SitePages/Home.aspx (15 December 2016, date last accessed).
3 Truyers C, Goderis G, Dewitte H, et al. The Intego database: background, methods and basic results of a Flemish general practice-based continuous morbidity registration project. BMC Med Inform Decis Mak 2014; 14: 48.
4 Wiréhn A-BE, Karlsson HM, Carstensen JM. Estimating disease prevalence using a population-based administrative healthcare database. Scand J Public Health 2007; 35: 424–31.
5 Orueta JF, Roberto N-S, Maider M, et al. Monitoring the prevalence of chronic conditions: which data should we use? BMC Health Serv Res 2012; 12: 365.
6 Duarte-Ramos F, Filipa D-R, José C. Using a pharmacoepidemiological approach to estimate diabetes type 2 prevalence in Portugal. Pharmacoepidemiol Drug Saf 2006; 15: 269–74.
7 Escribano-Serrano J, Paya-Giner C, Méndez Esteban MI, et al. Different methods used to estimate the prevalence of hypothyroidism, Cadiz, Spain. Rev Esp Salud Publica 2014; 88: 629–38.
8 Population change - demographic balance and crude rates at national level, Eurostat. Available at: http://appsso.eurostat.ec.europa.eu/nui/show.do (3 March 2016, date last accessed).
9 Inter Mutualistic Agency, Belgium. Available at: http://www.aim-ima.be (15 December 2016, date last accessed).
10 Echantillon Permanent/Permanente Steekproef, Inter Mutualistic Agency. Available at: http://www.aim-ima.be/Permanente-steekproef-EPS?lang=fr - perm1 (15 December 2016, date last accessed).
11 Van der Heyden J, Mimildis H, Bartholomeeusen S, et al. Diabetesprevalentie in België: vergelijking van beschikbare data. Vlaams Tijdschrift voor Diabetologie 2012; 2: 6–8.
12 Demarest S, Van der Heyden J, Charafeddine R, et al. Methodological basics and evolution of the Belgian health interview survey 1997-2008. Arch Public Health 2013; 71: 24.
13 Van der Heyden J, Van Oyen H, Berger N, et al. Activity limitations predict health care expenditures in the general population in Belgium. BMC Public Health 2015; 15: 267.
14 Renard D, Molenberghs G, Van Oyen H, Tafforeau J. Investigation of the clustering effect in the Belgian Health interview survey 1997. Arch Publ Health 1998; 56: 345–61.
15 Statistics Belgium, Belgium. Available at: http://statbel.fgov.be/en/statistics/organisation/statistics_belgium/ (15 December 2016, date last accessed).
16 Bartholomeeusen S, Kim C-Y, Mertens R, et al. The denominator in general practice, a new approach from the Intego database. Fam Pract 2005; 22: 442–7.
17 Commission Regulation (EU) No 141/2013 of 19 February 2013 implementing Regulation (EC) No 1338/2008 of the European Parliament and of the Council on Community statistics on public health and health and safety at work, as regards statistics based on the European Health Interview Survey (EHIS). J Eur Union 2013; 47: 20–48.
18 NCD Risk Factor Collaboration (NCD-RisC). Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. Lancet 2016; 387: 1513–30.
19 World Health Organisation. Global report on diabetes. 2016. Available at: http://apps.who.int/iris/bitstream/10665/204871/1/9789241565257_eng.pdf (10 July 2017, date last accessed).
20 Canaris GJ, Manowitz NR, Mayor G, Ridgway EC. The Colorado thyroid disease prevalence study. Arch Intern Med 2000; 160: 526–34.
21 Vanderpump MP, Tunbridge WM, French JM, et al. The incidence of thyroid disorders in the community: a twenty-year follow-up of the Whickham Survey. Clin Endocrinol 1995; 43: 55–68.
22 Leese GP, Flynn RV, Jung RT, et al. Increasing prevalence and incidence of thyroid disease in Tayside, Scotland: the Thyroid Epidemiology Audit and Research Study (TEARS). Clin Endocrinol 2008; 68: 311–6.
23 Garmendia Madariaga A, Santos Palacios S, Guillén-Grima F, Galofré JC. The incidence and prevalence of thyroid dysfunction in Europe: a meta-analysis. J Clin Endocrinol Metab 2014; 99: 923–31.
24 Goderis G, Borgermans L, Heyrman J, et al. Type 2 Diabetes in Primary Care in Belgium: Need for Structured Shared Care. Exp Clin Endocrinol Diabetes 2009; 117: 367–72.
25 Comino EJ, Tran DT, Haas M, et al. Validating self-report of diabetes use by participants in the 45 and Up Study: a record linkage study. BMC Health Serv Res 2013; 13: 481.
26 Krass I, Schieback P, Dhippayom T. Adherence to diabetes medication: a systematic review. Diabet Med 2015; 32: 725–37.
27 Wiersinga WM. Subclinical hypothyroidism and hyperthyroidism. I. Prevalence and clinical relevance. Neth J Med 1995; 46: 197–204.
28 Nieuwenhuis-Ruifrok AE, Kuchenbecker WKH, Hoek A, et al. Insulin sensitizing drugs for weight loss in women of reproductive age who are overweight or obese: systematic review and meta-analysis. Hum Reprod Update 2008; 15: 57–68.
29 Krotkiewski M. Thyroid hormones and treatment of obesity. Int J Obes 2000; 24: S116–9.
30 Seifarth C, Schehler B, Schneider HJ. Effectiveness of metformin on weight loss in non-diabetic individuals with obesity. Exp Clin Endocrinol Diabetes 2013; 121: 27–31.
31 Desilets AR, Dhakal-Karki S, Dunican KC. Role of Metformin for Weight Management in Patients Without Type 2 Diabetes. Ann Pharmacother 2008; 42: 817–26.
32 Beckett M, Weinstein M, Goldman N, Yu-Hsuan L. Do health interview surveys yield reliable data on chronic illness among older respondents? Am J Epidemiol 2000; 151: 315–23.
33 Brix TH, Kyvik KO, Hegedüs L. Validity of self-reported hyperthyroidism and hypothyroidism: comparison of self-reported questionnaire data with medical record review. Thyroid 2001; 11: 769–73.
34 Leong A, Aaron L, Kaberi D, et al. Systematic Review and Meta-Analysis of Validation Studies on a Diabetes Case Definition from Health Administrative Records. PLoS One 2013; 8: e75256.
35 van der Bij S, Khan N, Ten Veen P, et al. Improving the quality of EHR recording in primary care: a data quality feedback tool. J Am Med Inform Assoc 2017; 24: 81–87.

© The Author 2017. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.
The European Journal of Public Health – Oxford University Press
Published: Feb 1, 2018