The importance of historical residential address information in longitudinal studies using administrative health data

Abstract

Background
When information on changes in address or migration of people to or from a study jurisdiction is unavailable in longitudinal studies, issues relating to loss-to-follow-up and misclassification bias may result. This study investigated how estimations of associations between general practitioner (GP) contact and hospital use were affected by incomplete address and migration data.

Methods
This was a retrospective population-based cohort study of Western Australians from 1990 to 2004. Linked administrative data including mortality records, hospital admissions, primary care and Electoral Roll records were used. Regularity of GP contact, based on the variance of the number of days between GP visits, was calculated for each person-year. Outcomes were the number and costs (A$2014) of diabetes-related hospital admissions in the following year. Models were estimated separately for cohorts where (i) postcode was ascertained at study commencement and held constant, and (ii) postcode and residency within Western Australia were updated with each change of address recorded on the Electoral Roll over the study period.

Results
Updating address data reduced total person-years by 11% and changed the distribution of covariates. Estimations of associations between patterns of GP contact and number of hospitalizations changed; the incidence rate ratios measuring the relationship with the most regular GP contact (baseline of those with <2 GP visits) changed from 0.81 [95% confidence interval (CI) 0.66–1.00] to 0.42 (95% CI 0.33–0.53) after updating postcode information. Impacts on cost models were smaller, though still statistically significant.

Conclusions
Longitudinal studies using administrative data may report biased results if they ignore address changes and migration. Researchers should attempt to link to these data wherever possible, or choose study designs which these issues are less likely to affect. Custodians should be aware that such data can be vital to high quality research.

Keywords: Administrative data, time at risk, hospital use, dynamic cohort, migration

Key Messages
Studies relying on administrative data collections can suffer from missing information on the migration of cohort members, which may lead to incorrect calculations of time at risk and may bias study findings.
We estimated associations between general practitioner access and hospital use in a cohort where migration was uncaptured, compared with the same cohort with complete information on migration within and outside the study jurisdiction.
Measures of association were biased in the cohort without migration information.
Estimates of the rate of hospitalizations were affected to a greater extent than estimates of hospitalization costs.
Researchers using administrative data should access information on changes in address, especially migration out of the jurisdiction from which the health service data are available, wherever possible.

Introduction
Recent decades have seen a dramatic rise in the amount of electronic data recorded in the delivery of health care. Although not captured specifically for research purposes, these ‘administrative’ data are frequently used for research, facilitated in part by advances in computing power.1–3 Though studies using such data can have many strengths, a common issue is that information pertinent to the research question may be missing from administrative collections.4 The issue at the centre of this paper is migration of study members out of the study jurisdiction(s). This migration is not captured in most administrative data collections and may lead to a form of misclassification bias.
Where a study population or data collection is defined by residence within a particular jurisdiction, uncaptured migration, whether to or from another state, province or country, can lead to migrants being erroneously assumed as having no health service contacts or outcomes of interest before immigration or following emigration. This will overestimate the size of the cohort/time at risk and potentially lead to inaccurate estimations of the rates, odds or hazards of events.4,5 Furthermore, where geographically derived measures of remoteness or socioeconomic status (SES) are used, uncaptured migration can lead to misclassification of these potential confounders. For these reasons, researchers often use external population data sources, such as census data, to complement health service datasets.4,6 In studies in Australia, the Electoral Roll, maintained by the Australian Electoral Commission (AEC), is frequently used either in defining a cohort or to identify control populations.7 As voting is compulsory for citizens aged 18 and over, the Electoral Roll provides almost comprehensive population data8 incorporating gender, birthdate and residential location; furthermore, address changes are actively captured (including emigration).9,10 Representativeness is diminished by under-representation of Indigenous Australians and younger adults and, furthermore, non-citizens are not captured.10 Most countries that have mature linked administrative data systems also have good quality general population data derived from Census, Electoral Roll or other national registration systems.4,11–17 However, these data are not always available for linkage, and anecdotally, studies incorporating these data generally limit their use to the definition of a study population, identification of matched controls or capture of baseline characteristics. 
In Australia, the AEC may release data to researchers under the Commonwealth Electoral Act 1918;18 however, this is usually only a single cross-sectional snapshot for use in defining a cohort of interest or matched controls. The rationale behind this restriction is unclear; however, it is unlikely to be for legislative reasons since precedents exist for the release of more detailed data.10 The aim of the current paper is to demonstrate the utility of comprehensive longitudinal Electoral Roll data for determining more accurately both person-time at risk and confounding due to geographically defined SES and access to health services. This is illustrated through the evaluation of the impact of regularity of general practitioner (GP) contact on numbers and costs of potentially avoidable hospitalizations. Regular GP contact is hypothesized to improve health in chronic disease in part by improving compliance, and has previously been shown to be associated with reduced risk of hospitalization and death in certain groups.19,20

Methods
This was a retrospective population-based cohort study of Western Australian (WA) electors registered with Medicare (Australia’s universal public health system) between 1 July 1990 and 30 June 2004. Individual service records were obtained from the WA Mortality Register (1990 to 2004), WA Hospital Morbidity Data Collection (HMDC) (1980 to 2004) and Medicare Benefits Scheme (MBS) claims database (1984 to 2004). The WA Mortality Register details all deaths registered in WA; the HMDC includes all separation records for all hospitals in WA;21 and the MBS records all services attracting reimbursement through Medicare,22 which includes all GP services. We also obtained WA Electoral Roll data (1988 to 2004), including the first registration and all records where address information was updated, together with the reason for, and date of, each update.
Data were linked by the WA Data Linkage Branch using probabilistic matching previously demonstrated to have low error rates, of 0.11% for both false-positives and false-negatives.23

Cohort membership
Two cohorts were constructed:
(i) a fixed cohort, which people entered at the start of the first financial year following both (a) the date of their first Electoral Roll record and (b) their first indication of diabetes risk (defined in Table 1). Cohort exit was at the end of the last financial year preceding death or the study end date. Entry and exit dates follow the Australian financial year, which runs from 1 July to 30 June. This was termed the ‘Non-Historical Electoral Roll cohort’;
(ii) a dynamic cohort where entry and final exit were as above but where periods of temporary exit and re-entry to the study cohort were captured via the updated address records through the Electoral Roll. This cohort was termed the ‘Historical Electoral Roll cohort’.

Table 1. Cohorts of confirmed diabetics, people at high risk of diabetes and people with other cardiovascular risk factors as identified through Hospital Morbidity Data Collection (HMDC) and Medicare Benefits Scheme (MBS) datasets

Data source: HMDC
People with diabetes: Diagnosis of diabetes in any diagnosis field on any hospital record, using the International Classification of Diseases 9th Revision, Clinical Modification (ICD-9-CM) codes 250.xx, and the International Classification of Diseases 10th Revision, Australian Modification (ICD-10-AM) codes E10.xx, E11.xx, E13.xx or E14.xx (where xx denotes any digit between 0 and 9), or
People at high risk of diabetes: Diagnosis of impaired glucose function in any hospital record (ICD-9-CM code 790.2, ICD-10-AM codes E09.xx, R73.xx or O24.5), or a diagnosis of obesity on any hospital record (ICD-9-CM codes 278.0, 278.1, ICD-10-AM codes E65, E66.xx) in addition to being Indigenous and aged 45 or over, or
People with cardiovascular risk factors: Diagnosis of ischaemic heart disease or hypertension in any hospital record (ICD-9-CM codes 401.xx–405.xx, 410.xx–415.xx, ICD-10-AM codes I10.xx–I15.xx, I20.xx–I25.xx), or

Data source: MBS
People with diabetes: Diabetes cycle of care consultation (MBS items 2517, 2518, 2521, 2522, 2525, 2526, 2620, 2622, 2624, 2631, 2633, 2635), or glycated haemoglobin (HbA1c) quantitation twice within any 6-month period (MBS items 66551, 66319, 66320, 2043, 2044, 1313, 1314), or quantitation of fructosamine for management of established diabetes (MBS item 66557), or dietetics, education or exercise physiology service for people with type 2 diabetes (MBS items 81100, 81105, 81110, 81115, 81120, 81125)
People at high risk of diabetes: MBS record indicating use of the oral glucose tolerance test (OGTT) outside pregnancy (MBS items 66542, 66419), or MBS record indicating HbA1c quantitation (once within a 6-month period)
People with cardiovascular risk factors: MBS record indicating coronary arteriography or angiocardiography (MBS items 59912, 59925, 59971, 59972, 59974), or MBS record indicating a myocardial perfusion study (MBS items 61302, 61303, 61306, 61307, 61310, 61651, 61652, 61653, 61654)

Eligibility for analysis
The setting for this paper was the analysis of the association between regularity of general practitioner (GP) contact in 1 year (the exposure year), and diabetes-related hospitalization in the following year (the outcome year), with the unit of analysis being a series of pairs (or couplets) of financial years. Each person-year contributes to two couplets, as an exposure and outcome year, except the first and last years for which an individual was in the study (and years immediately following re-entry to the study area/before leaving the study area in the case of the Historical Electoral Roll cohort), which count only as an exposure and an outcome year, respectively. Individuals in the Historical Electoral Roll cohort were only deemed eligible in the financial years during which they resided in WA for a full 365 days in 2 consecutive years, as both the exposure and outcome of interest were calculated over a 1-year period.
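The couplet rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `exposure_outcome_couplets` is hypothetical, and each financial year is represented by the calendar year in which it starts (1990 for 1990/91). A year is taken to be eligible when the person resided in WA for the whole of it.

```python
from typing import Iterable, List, Tuple


def exposure_outcome_couplets(eligible_years: Iterable[int]) -> List[Tuple[int, int]]:
    """Pair each eligible financial year with the next one, if that is eligible too.

    Hypothetical helper: a year y (the exposure year) forms a couplet with
    y + 1 (the outcome year) only when both years are in the eligible set.
    """
    years = set(eligible_years)
    return [(y, y + 1) for y in sorted(years) if y + 1 in years]


if __name__ == "__main__":
    # Fixed (Non-Historical) follow-up: every year from entry to exit counts.
    print(exposure_outcome_couplets(range(1990, 1997)))
    # Dynamic (Historical) follow-up: the person was outside WA in 1993 and 1994.
    print(exposure_outcome_couplets([1990, 1991, 1992, 1995, 1996]))
```

In the second call, 1992 serves only as an outcome year and 1995 only as an exposure year, which mirrors the treatment of years adjacent to a period spent outside the study area.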
The Non-Historical cohort were eligible from study entry until death or the study end date, reflecting the methods that this study would have followed if only the typical Electoral Roll data were available. The Historical Electoral Roll cohort is therefore a subset of the Non-Historical Electoral Roll cohort, with the additional address information used to exclude ineligible person-years (which may result in the exclusion of an individual entirely).

Contact with a GP
Regularity of contact with a GP was defined for each financial year from the variance in the number of days between GP visits, as described previously.24 Briefly, for each GP visit within the exposure period (in this case each financial year), the number of days since the previous GP visit (which may have been in the preceding year) was calculated. Based on these values, an annual regularity index (R) was constructed using the formula R = 1/(1 + Variance(Days)). This resulted in a score between 0 and 1 for each person-year, with a score of 1 indicating perfectly regular primary care contact and a score of 0 indicating perfectly irregular primary care contact. The index requires a minimum of two services within a financial year to calculate; people were given a score of 0 for years in which they had fewer than two services. For analysis, regularity was transformed into an ordinal variable by first logarithmically transforming the score (due to its highly skewed nature) and then creating quintiles of regularity (quintile 1 representing the lowest 20% of the logarithmic regularity distribution and quintile 5 the highest 20%). Years with fewer than two GP visits were used as the reference category in all analyses and are referred to as having ‘minimal GP contact’. Note that regularity is defined independently of the frequency of GP attendance (i.e. the number of GP visits in the exposure period); frequency was captured as a count and controlled for separately in the analysis.
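As a minimal sketch of the regularity index described above, the following computes R = 1/(1 + Variance(Days)) for one person-year. Several details are assumptions rather than the published method: the function name `regularity_index` is hypothetical, the population variance is used (the text does not say whether population or sample variance applies), and when no earlier visit is known the first visit of the year contributes no interval.

```python
from datetime import date, timedelta
from statistics import pvariance
from typing import Optional, Sequence


def regularity_index(visits: Sequence[date], prior_visit: Optional[date] = None) -> float:
    """Return R = 1 / (1 + variance of days between consecutive GP visits).

    visits: GP visit dates falling within one financial year.
    prior_visit: the most recent visit before the year, if known.
    Years with fewer than two visits score 0, as in the text.
    """
    if len(visits) < 2:
        return 0.0
    timeline = ([prior_visit] if prior_visit is not None else []) + sorted(visits)
    gaps = [(later - earlier).days for earlier, later in zip(timeline, timeline[1:])]
    # Assumption: population variance; a single gap therefore has variance 0.
    return 1.0 / (1.0 + pvariance(gaps))


if __name__ == "__main__":
    start = date(2000, 7, 1)
    evenly_spaced = [start + timedelta(days=30 * i) for i in range(3)]
    print(regularity_index(evenly_spaced, prior_visit=start - timedelta(days=30)))
    uneven = [start, start + timedelta(days=10), start + timedelta(days=60)]
    print(regularity_index(uneven))
```

Evenly spaced visits give a variance of zero and hence the maximum score, whereas gaps of 10 and 50 days give a variance of 400 and a score close to zero, which illustrates why the score is highly skewed and was log-transformed before quintiles were formed.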
Outcomes under study
Hospitalizations where diabetes is a risk factor were identified from Davis et al.25 and used to identify diabetes-related hospitalizations initiated in each financial year. The outcomes of interest were the number and cumulative cost of diabetes-related hospital admissions within a financial year. Where inter-hospital transfers had taken place, the transfer records were regarded as a single episode of care, to prevent over-counting of admissions. The cost of each episode of care was assigned based on the Australian Refined Diagnosis Related Group (AR-DRG) code recorded across all records included in the episode. The average cost for the specific AR-DRG was used to assign the cost for the episode of care as reported in the National Hospital Cost Data Collections for Western Australia,26 specific to the date of separation of each hospital record. Estimated costs were inflated to 2014 levels using the Australian Consumer Price Index.

Other covariates
Sex and age were determined from Electoral Roll records. Both SES using the Socio-economic Index for Areas (SEIFA) index of relative social disadvantage (IRSD),27 and accessibility to services using the Accessibility and Remoteness Index of Australia,28 were ascertained using postcode on the Electoral Roll. For the Non-Historical Electoral Roll cohort, these were ascertained at study entry and remained constant. For the Historical Electoral Roll cohort, these were updated for each change in postcode. Comorbidity was applied based on the Multipurpose Australian Comorbidity Scoring System (MACSS).29 Comorbidity was defined as having a MACSS condition, excluding diabetes, on any HMDC diagnosis field in the previous 5 years, for each study year. For each financial year, individuals were classified into one of three mutually exclusive risk groups (people with diabetes, or at high risk of diabetes or with cardiovascular risk factors) using information from their MBS and HMDC records.
The criteria used to determine the three risk groups were developed in consultation with our clinical steering panel and are shown in Table 1. Individuals entered a risk group on the day they met the criteria for that group, and exited either upon moving into a higher-risk group or upon death. The groups are considered hierarchical; people with criteria for both ‘high risk of diabetes’ and ‘cardiovascular risk factors’ are allocated to the high-risk group.

Statistical analysis
Descriptive statistics were generated to evaluate the differences between the two cohorts with a focus on: (i) the number of individuals; (ii) the number of records (exposure-outcome couplets of analysis); and (iii) changes in residential location. To determine the effect of any differences between the cohorts on the estimated association between regularity of GP contact and subsequent diabetes-related hospitalizations, multivariable regression was undertaken for both cohorts for each era. Negative binomial regression was used for the number of hospitalizations, with the number of days out of hospital in each outcome year as the exposure; generalized linear models with a gamma distribution and identity link were used for cost. Clustered-robust standard errors were used to account for intra-person correlation due to the repeated nature of the observations. Confounding was limited using covariate adjustment by means of a generalized propensity score (GPS) suitable for non-binary exposures.30 The GPS was constructed using the time-invariant covariates sex and Indigenous status and the time-varying covariates age, financial year of exposure, frequency of GP visits (in exposure year) and comorbidity risk group. Models were used to estimate the marginal effects of minimal GP contact (i.e. no regularity score) and each quintile of regularity at the mean of the GPS in each era for each cohort. Marginal means with 95% confidence intervals were plotted.
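One way to write down the two models described above is sketched here. The notation is an assumption introduced for illustration and does not appear in the original: i indexes persons, t outcome years, d_it is the number of days out of hospital, Q_it is the regularity category in the exposure year (0 for minimal GP contact, the reference, and 1 to 5 for the quintiles), and the exposure is taken to enter the count model as a log offset, which is the usual convention.

```latex
% Count model: negative binomial regression with a log link and exposure offset
\begin{align}
Y_{it} &\sim \mathrm{NegBin}(\mu_{it}, \alpha), \\
\log \mu_{it} &= \log d_{it} + \beta_0
  + \sum_{k=1}^{5} \beta_k \, \mathbb{1}[Q_{it} = k]
  + \gamma \, \mathrm{GPS}_{it}, \\
\mathrm{IRR}_k &= \exp(\beta_k).
\end{align}

% Cost model: gamma generalized linear model with an identity link
\begin{align}
C_{it} &\sim \mathrm{Gamma}, \\
\mathbb{E}[C_{it}] &= \delta_0
  + \sum_{k=1}^{5} \delta_k \, \mathbb{1}[Q_{it} = k]
  + \theta \, \mathrm{GPS}_{it}.
\end{align}
```

Under this reading, each incidence rate ratio compares a regularity quintile with minimal GP contact at a fixed value of the GPS, while each cost coefficient is an additive difference in dollars because of the identity link.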
For each quintile in each era, the marginal means, standard errors of the mean and group sizes were used to test the significance of differences between the two cohorts via unpaired t tests. Analyses were undertaken separately across two eras: Era 1 from 1990/1991 to 1998/1999 and Era 2 from 1999/2000 to 2002/2003, representing shifts in primary care policy in Australia, specifically the introduction of the Enhanced Primary Care programme. This programme provided reimbursement to GPs for performing health assessments, care planning and case conferencing for certain patient groups, with the aim of improving the co-ordination of care.31 Note that although the study period begins in 1990/91, earlier data are used to ascertain comorbid status (which for each exposure year has a 5-year look-back) and regularity (which takes into account the date of the latest GP visit preceding each exposure year).

Results
As shown in Table 2, 20 261 individuals eligible for the study in the Non-Historical Electoral Roll cohort were ineligible in the Historical Electoral Roll cohort as they did not have any eligible couplets of financial years spent entirely in WA. In addition to reducing the cohort size by 6.7%, this also caused small but statistically significant changes across all factors except gender. The differences in how the cohorts were followed up (i.e. whether the cohort was fixed or dynamic) affected the number of records eligible for analysis. As shown in Table 3, there were 211 744 records included in the Non-Historical Electoral Roll cohort which were excluded from the Historical Electoral Roll cohort, and the characteristics of these records showed small differences on all factors evaluated. The largest differences were observed in SES and accessibility to services, where the proportions of records with ‘unknown’ status reduced from above 6% to <1%, whereas on all other variables changes in the relative size of each category were less than 1%.

Table 2.
Differences in the characteristics of individuals at entry to the Non-Historical and Historical Electoral Roll cohorts

| Status of individuals in cohort | Category | Non-Historical n | Non-Historical % | Historical n | Historical % | Difference n | Difference % | P-value (c) |
|---|---|---|---|---|---|---|---|---|
| Age at cohort entry (years) (d) | Mean | 58.63 | | 59.18 | | 0.55 | | <0.001 |
| | SD | 15.21 | | 14.85 | | −0.36 | | |
| Gender | Female | 150998 | 50.1 | 140424 | 49.9 | −10574 | −7.0 | 0.242 |
| | Male | 150652 | 49.9 | 140965 | 50.1 | −9687 | −6.4 | |
| Indigenous status | No | 274170 | 90.9 | 260369 | 92.5 | −13801 | −5.0 | <0.001 |
| | Yes | 10281 | 3.4 | 9201 | 3.3 | −1080 | −10.5 | |
| | Unknown | 17199 | 5.7 | 11819 | 4.2 | −5380 | −31.3 | |
| Risk status at cohort entry | Confirmed diabetes | 41790 | 13.9 | 39926 | 14.2 | −1864 | −4.5 | <0.001 |
| | High risk of diabetes | 104492 | 34.6 | 93226 | 33.1 | −11266 | −10.8 | |
| | Cardiovascular risk factors | 155368 | 51.5 | 148237 | 52.7 | −7131 | −4.6 | |
| Calendar time period of cohort entry | Era 1 (1990/91–1998/99) | 213150 | 70.7 | 200747 | 71.3 | −12403 | −5.8 | <0.001 |
| | Era 2 (1999/2000–2002/03) | 88500 | 29.3 | 80642 | 28.7 | −7858 | −8.9 | |
| Died during study period | No | 241329 | 80.0 | 224134 | 79.7 | −17195 | −7.1 | 0.001 |
| | Yes | 60321 | 20.0 | 57255 | 20.3 | −3066 | −5.1 | |
| SEIFA IRSD (a) quintile at entry into cohort | Highest disadvantage | 52217 | 17.3 | 50047 | 17.8 | −2170 | −4.2 | |
| | High disadvantage | 79480 | 26.3 | 76892 | 27.3 | −2588 | −3.3 | |
| | Moderate disadvantage | 43205 | 14.3 | 41681 | 14.8 | −1524 | −3.5 | 0.793 (excluding unknown) |
| | Less disadvantage | 43485 | 14.4 | 42097 | 15.0 | −1388 | −3.2 | |
| | Least disadvantage | 70607 | 23.4 | 68204 | 24.2 | −2403 | −3.4 | |
| | Unknown | 12656 | 4.2 | 2468 | 0.9 | −10188 | −80.5 | <0.001 (including unknown) |
| Accessibility to services (ARIA) (b) at entry into cohort | Very remote | 9378 | 3.1 | 8682 | 3.1 | −696 | −7.4 | |
| | Remote | 5143 | 1.7 | 4972 | 1.8 | −171 | −3.3 | |
| | Moderately accessible | 16164 | 5.4 | 15714 | 5.6 | −450 | −2.8 | 0.052 (excluding unknown) |
| | Accessible | 16865 | 5.6 | 16469 | 5.9 | −396 | −2.3 | |
| | Highly accessible | 241573 | 80.1 | 233210 | 82.9 | −8363 | −3.5 | |
| | Unknown | 12527 | 4.2 | 2342 | 0.8 | −10185 | −81.3 | <0.001 (including unknown) |
| Total number of individuals in the cohort | | 301650 | | 281389 | | −20261 | −6.7 | |

(a) Socioeconomic Index for Areas, Index of Relative Disadvantage.
(b) Accessibility and Remoteness Index for Australia.
(c) Significance of difference in the relative number of individuals across categories between Non-Historical and Historical Electoral Roll cohorts undertaken using chi square testing unless otherwise stated.
(d) Significance of difference in mean of age undertaken using two-sided t test.

Table 3. Differences in the characteristics of records (couplets of exposure and outcome years) captured by the Non-Historical and Historical Electoral Roll cohorts

| Status of records in cohort | Category | Non-Historical n | Non-Historical % | Historical n | Historical % | Difference n | Difference % | P-value (c) |
|---|---|---|---|---|---|---|---|---|
| Gender | Female | 965810 | 50.3 | 856969 | 50.2 | −108841 | −11.3 | 0.029 |
| | Male | 954682 | 49.7 | 851779 | 49.8 | −102903 | −10.8 | |
| Indigenous status | No | 1779563 | 92.7 | 1611180 | 94.3 | −168383 | −9.5 | <0.001 |
| | Yes | 66625 | 3.5 | 52365 | 3.1 | −14260 | −21.4 | |
| | Unknown | 74304 | 3.9 | 45203 | 2.6 | −29101 | −39.2 | |
| Risk status | Confirmed diabetic | 478965 | 24.9 | 420893 | 24.6 | −58072 | −12.1 | <0.001 |
| | High risk of diabetes | 426121 | 22.2 | 366542 | 21.5 | −59579 | −14.0 | |
| | Cardiovascular risk factors | 1015406 | 52.9 | 921313 | 53.9 | −94093 | −9.3 | |
| SEIFA IRSD (a) quintile | Highest disadvantage | 329995 | 17.2 | 308175 | 18.0 | −21820 | −6.6 | <0.001 (including and excluding unknown) |
| | High disadvantage | 499152 | 26.0 | 473145 | 27.7 | −26007 | −5.2 | |
| | Moderate disadvantage | 266558 | 13.9 | 251966 | 14.7 | −14592 | −5.5 | |
| | Less disadvantage | 277003 | 14.4 | 261501 | 15.3 | −15502 | −5.6 | |
| | Least disadvantage | 427909 | 22.3 | 404877 | 23.7 | −23032 | −5.4 | |
| | Unknown | 119875 | 6.2 | 9084 | 0.5 | −110791 | −92.4 | |
| Accessibility to services (ARIA) (b) | Very remote | 50333 | 2.6 | 45175 | 2.6 | −5158 | −10.2 | <0.001 (including and excluding unknown) |
| | Remote | 32382 | 1.7 | 30474 | 1.8 | −1908 | −5.9 | |
| | Moderately accessible | 99184 | 5.2 | 93730 | 5.5 | −5454 | −5.5 | |
| | Accessible | 105445 | 5.5 | 100362 | 5.9 | −5083 | −4.8 | |
| | Highly accessible | 1513839 | 78.8 | 1430461 | 83.7 | −83378 | −5.5 | |
| | Unknown | 119309 | 6.2 | 8546 | 0.5 | −110763 | −92.8 | |
| Era in which services were provided | Era 1 (1990/91–1998/99) | 1051127 | 54.7 | 943226 | 55.2 | −107901 | −10.3 | <0.001 |
| | Era 2 (1999/00–2002/03) | 869365 | 45.3 | 765522 | 44.8 | −103843 | −11.9 | |
| Number of GP visits in exposure year | Less than two GP visits | 252939 | 13.2 | 207785 | 12.2 | −45154 | −17.9 | <0.001 |
| | Two or more GP visits | 1667553 | 86.8 | 1500963 | 87.8 | −166590 | −10.0 | |
| Total number of records captured | | 1920492 | | 1708748 | | −211744 | −11.0 | |

(a) Socioeconomic Index for Areas, Index of Relative Disadvantage.
(b) Accessibility and Remoteness Index for Australia.
(c) Significance of difference in the relative number of records across categories between the Non-Historical and Historical Electoral Roll cohorts undertaken using chi square testing unless otherwise stated.

Figure 1 shows that only 57% of individuals did not change their postcode during the study period and thus would have had their postcode correctly identified for their entire follow up with Non-Historical Electoral Roll data. The impact on SES was less dramatic, with only 15% of individuals having one change in SES and less than 5% having two or more. Accessibility to services was more stable, with only 4% having one change in accessibility and less than 1% having two or more changes.

Figure 1. The number and percentage of cohort members with changes in location information during the study period according to Electoral Roll records.

Regularity of GP contact was closely related to the volume of GP contact; in both the Historical and Non-Historical cohorts, those in the lowest quintile had a mean of 4.4 visits [standard deviation (SD) 2.6] compared with 21.0 (SD 14.5) in the highest quintile. Table 4 shows the results of modelling the association between regularity of GP access and diabetes-related hospitalization in the following year for both cohorts. Table 4A presents the magnitude of the relative effect of regularity on rates of diabetes-related hospitalization, derived from negative binomial regression models. The relative effect of regularity was greater for the Historical Electoral Roll cohort, particularly in the highest quintile of regularity, where the relative effect was nearly doubled [incidence rate ratio (IRR) in Era 1: 0.49, 95% CI 0.40 to 0.60; in Era 2: 0.42, 95% CI 0.33 to 0.53] compared with that observed for the Non-Historical Electoral Roll cohort (Era 1: 0.80, 95% CI 0.67 to 0.97; Era 2: 0.81, 95% CI 0.66 to 1.00). The predicted rates of diabetes-related hospitalization from the models are shown in Figure 2 (A and B). At minimal GP contact or low regularity, the Non-Historical Electoral Roll cohort underestimated the rate of diabetes-related hospitalization, whereas at higher regularity the rate tended to be overestimated. Differences in marginal means between the Historical and Non-Historical Electoral Roll cohorts were significant for the groups with minimal GP visits and highest regularity in both eras, and for those with the lowest regularity in the first era only.

Table 4.
The impact of Historical Electoral Roll data on the relationship between regularity of attendance with a general practitioner and (A) the rate of diabetic-related hospitalization in the following year, from negative binomial regression models, and (B) the cost (A$ 2014) of diabetic-related hospitalization in the following year, from generalized linear regression models

A. The rate of diabetic-related hospitalization in the following year (b). Cells show IRR (a) (95% CI), P-value.

| | Non-Historical Electoral Roll, Era 1 | Historical Electoral Roll, Era 1 | Non-Historical Electoral Roll, Era 2 | Historical Electoral Roll, Era 2 |
| --- | --- | --- | --- | --- |
| Minimal GP contact (c) | Reference | Reference | Reference | Reference |
| Lowest quintile of regularity | 0.65 (0.57 to 0.75), <0.001 | 0.55 (0.47 to 0.64), <0.001 | 0.61 (0.52 to 0.71), <0.001 | 0.43 (0.36 to 0.51), <0.001 |
| Quintile 2 | 0.50 (0.43 to 0.59), <0.001 | 0.37 (0.32 to 0.44), <0.001 | 0.52 (0.44 to 0.62), <0.001 | 0.35 (0.29 to 0.43), <0.001 |
| Quintile 3 | 0.56 (0.47 to 0.65), <0.001 | 0.39 (0.33 to 0.47), <0.001 | 0.67 (0.56 to 0.81), <0.001 | 0.41 (0.33 to 0.50), <0.001 |
| Quintile 4 | 0.67 (0.57 to 0.80), <0.001 | 0.43 (0.35 to 0.52), <0.001 | 0.67 (0.55 to 0.81), <0.001 | 0.37 (0.30 to 0.46), <0.001 |
| Highest quintile of regularity | 0.80 (0.67 to 0.97), 0.021 | 0.49 (0.40 to 0.60), <0.001 | 0.81 (0.66 to 1.00), 0.0550 | 0.42 (0.33 to 0.53), <0.001 |
| Generalized propensity score | 0.70 (0.56 to 0.86), 0.001 | 1.40 (1.12 to 1.76), 0.004 | 1.33 (1.02 to 1.74), 0.0350 | 3.11 (2.31 to 4.18), <0.001 |
| Constant | 0.00 (0.00 to 0.00), <0.001 | 0.00 (0.00 to 0.00), <0.001 | 0.00 (0.00 to 0.00), <0.001 | 0.00 (0.00 to 0.00), <0.001 |

B. The cost (A$ 2014) of diabetic-related hospitalizations in the following year (b). Cells show coefficient (d) (95% CI), P-value.

| | Non-Historical Electoral Roll, Era 1 | Historical Electoral Roll, Era 1 | Non-Historical Electoral Roll, Era 2 | Historical Electoral Roll, Era 2 |
| --- | --- | --- | --- | --- |
| Minimal GP contact (c) | Reference | Reference | Reference | Reference |
| Lowest quintile of regularity | −820.6 (−973.0 to −668.2), <0.001 | −1097.3 (−1263.2 to −931.5), <0.001 | −645.3 (−710.6 to −580.0), <0.001 | −921.3 (−998.7 to −843.8), <0.001 |
| Quintile 2 | −1090.3 (−1242.5 to −938.1), <0.001 | −1414.6 (−1581.1 to −1248.1), <0.001 | −722.9 (−794.1 to −651.6), <0.001 | −1047.8 (−1130.8 to −964.7), <0.001 |
| Quintile 3 | −984.8 (−1142.9 to −826.8), <0.001 | −1350.6 (−1520.8 to −1180.3), <0.001 | −628.6 (−708.7 to −548.6), <0.001 | −950.0 (−1041.4 to −858.5), <0.001 |
| Quintile 4 | −861.7 (−1033.3 to −690.2), <0.001 | −1287.0 (−1467.5 to −1106.4), <0.001 | −572.4 (−662.0 to −482.9), <0.001 | −949.1 (−1047.3 to −851.0), <0.001 |
| Highest quintile of regularity | −593.7 (−772.5 to −414.9), <0.001 | −1046.7 (−1235.1 to −858.3), <0.001 | −293.8 (−403.5 to −184.1), <0.001 | −701.4 (−822.7 to −580.1), <0.001 |
| Generalized propensity score | 617.3 (473.8 to 760.8), <0.001 | 1113.0 (974.5 to 1251.4), <0.001 | 951.0 (834.9 to 1067.0), <0.001 | 1308.2 (1192.0 to 1424.4), <0.001 |
| Constant | 1803.9 (1653.4 to 1954.3), <0.001 | 1928.7 (1766.3 to 2091.0), <0.001 | 1060.0 (996.5 to 1123.5), <0.001 | 1245.8 (1172.0 to 1319.6), <0.001 |

Era is the calendar period in which the exposure was ascertained: Era 1 (1990/91–1998/99) and Era 2 (1999/2000–2002/03).
a Incidence rate ratio: ratio of the incidence rate of diabetic-related hospitalizations in the following year for each scenario compared with the incidence rate in the reference scenario.
b All analyses control for confounding using a generalized propensity score constructed including gender, age, indigenous status, year of exposure, frequency of general practitioner contact, comorbidities, diabetes risk level, socioeconomic disadvantage and service accessibility.
c Reference category is fewer than two GP visits in the exposure year (i.e. no regularity score).
d Coefficient: difference in the mean cost (A$ 2014) of diabetic-related hospitalization in the following year compared with that of the reference scenario.
Figure 2. Predicted rates and costs of diabetic-related hospitalizations by quintile of regularity, Historical and Non-Historical Electoral Roll cohorts presented separately.
Table 4 (B) and Figure 2 (C and D) show the results of evaluating the association between regularity and the cost of diabetic-related hospitalization using generalized linear models; the pattern was similar to that for hospitalization rates. The models show a larger effect size for the Historical Electoral Roll cohort. The coefficient for the highest quintile of regularity in Era 1 was −1046.7 (95% CI −1235.1 to −858.3) in the Historical Electoral Roll cohort compared with −593.7 (95% CI −772.5 to −414.9) in the Non-Historical cohort; in Era 2, coefficients were −701.4 (95% CI −822.7 to −580.1) and −293.8 (95% CI −403.5 to −184.1) for the Historical and Non-Historical cohorts, respectively. Figure 2 (C and D), presenting marginal mean costs, indicates that this is largely driven by an underestimation of the cost at minimal GP contact and lower levels of regularity, whereas estimates of mean costs were similar at higher levels of regularity. Differences in marginal means between cohorts were significant for the minimal GP contact group in both Eras and for the least regular quintile in Era 2 only, and were non-significant in other cases.
Discussion
Summary of findings
These results suggest that where longitudinal studies do not follow participants to update residential addresses, results may be biased.
In this analysis, when data were updated with historical address information, the cohort size and person-time at risk were reduced by 6.7% and 11.0%, respectively; without these data, changes in SES and remoteness went unrecorded. Effect estimates were also biased; the negative associations between regularity of GP contact and both number and cost of hospitalizations were significantly weaker when historical address information was not used.
Interpretation of results
Records appearing in the Non-Historical Electoral Roll cohort that were not included in the Historical Electoral Roll cohort represent years in which individuals were out of WA or incarcerated for all or part of the year, and therefore had no or limited opportunity to interact with the WA Health system. These records would have the effect of suggesting an association between lower regularity of GP access and lower hospital use. In both cohorts, individuals with minimal GP contact tended to have higher levels of hospital use. The inclusion of the erroneous records in the Non-Historical analysis reduced the magnitude of this association, with the number and cost of hospitalizations underestimated in those with minimal GP contact, and minor differences between estimates at the remaining quintiles of regularity. The publication of such biased results has clear policy implications. Whereas the results using Historical Electoral Roll data may provide an incentive for a policy maker to promote GP access or remove barriers to access, in the expectation of reduced demand on hospitals, the publication of biased analyses based on the Non-Historical Electoral Roll data would diminish the incentive for action in this area. Results for hospitalization rates were biased to a greater extent than hospitalization costs, which may be due to the different modelling techniques used.
The evaluation of hospitalization rates incorporated person-time at risk as the denominator, a metric significantly changed by the make-up of the two cohorts. The GLM method used for evaluating the cost of hospitalizations does not use a denominator and would not have been as affected by the overestimation in person-time at risk. It is likely that inaccuracy in SES and remoteness in the analysis using Non-Historical Electoral Roll data affected effect estimates, given their known influence on access to services,32–34 but it is unclear how this might have biased findings.
Implications and generalizability
The literature presents examples of studies which may have been similarly strengthened by the inclusion of historical address information. Robbins and Webb identified a low-income population with diabetes from Philadelphia Health Care Centres, to assess racial/ethnic disparities in hospital use.35 The authors acknowledge a lack of data concerning admissions outside Pennsylvania and movements out-of-State as a limitation. Similarly, Bindman et al. examined the impact of breaks in public insurance coverage on the risk of hospitalization for certain conditions in California, and again acknowledge a lack of data on hospitalizations out-of-State as potentially impacting on findings.36 The authors of both papers acknowledge that data on hospitalizations out-of-State could have improved the accuracy of results; information on migration could have been similarly useful by allowing time at risk to be corrected. Approximately 2–3% of West Australian residents emigrate either interstate or overseas each year37,38 (workings not shown), though rates for citizens, as included in this analysis, may differ. Researchers studying populations with lower migration rates (for example due to age or health status) may be less concerned about these issues.
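The correction of time at risk discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the study: the function names, the representation of residency as a list of (start, end) date pairs, and the example dates are all hypothetical, and the study itself worked with financial-year person-year records rather than continuous intervals.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # simplifying assumption for converting days to years


def years_between(start, end):
    """Length of the interval [start, end) in years; zero if end <= start."""
    return max((end - start).days, 0) / DAYS_PER_YEAR


def fixed_person_years(entry, exit_):
    """Fixed cohort: everyone is assumed resident from entry to exit."""
    return years_between(entry, exit_)


def dynamic_person_years(entry, exit_, resident_intervals):
    """Dynamic cohort: count only time inside recorded residency intervals.

    resident_intervals is a hypothetical list of (start, end) dates during
    which address records place the person inside the study jurisdiction.
    """
    total = 0.0
    for start, end in resident_intervals:
        clipped_start = max(start, entry)
        clipped_end = min(end, exit_)
        if clipped_end > clipped_start:
            total += years_between(clipped_start, clipped_end)
    return total


# Hypothetical person: followed for ten years, absent for three of them.
entry, exit_ = date(1990, 7, 1), date(2000, 7, 1)
intervals = [
    (date(1988, 1, 1), date(1994, 7, 1)),   # resident, then moves away
    (date(1997, 7, 1), date(2004, 6, 30)),  # returns to the jurisdiction
]

fixed = fixed_person_years(entry, exit_)
dynamic = dynamic_person_years(entry, exit_, intervals)
print(f"fixed: {fixed:.2f} person-years, dynamic: {dynamic:.2f} person-years")
```

In this invented example the fixed approach credits roughly ten person-years, while the dynamic approach credits roughly seven, so any rate computed with the fixed denominator would be too low.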
Strengths and limitations
This study was based on a large dataset including all West Australian electors, and we consider the findings to be generalizable to most developed nations. Aside from the address data used, the two analyses were identical and we can be confident in the internal validity of this work. The data available for this study are from 1990 to 2004, and therefore the association between regularity and hospitalization may have changed more recently. However, describing the contemporaneous relationship was not the purpose of this study and thus the age of the data is immaterial. There are almost certainly incorrect links in the data used, though these are likely rare with the system used to perform the linkages, previously demonstrated to have proportions of both false-positives and false-negatives of 0.11%.23 Although the analysis included here is a useful example of this issue, it does not provide any indication of the magnitude or direction of any bias that may be introduced into other studies, which will differ in design, cohort, follow-up period and so on. For example, the missing address information might have affected this analysis differently had we used a time-to-event methodology or examined a different cohort where migration patterns might have been different. We use an example in which both the exposure (GP contact) and outcome (hospital use) can be affected by the issues described; thus this may be an ‘extreme’ example of how missing historical address data may affect results. Furthermore, we cannot estimate the extent to which the difference in results between the two cohorts arises from incorrect ascertainment of SES/remoteness as opposed to overestimation of time at risk. Where time at risk was overestimated, individuals may have remained in the Historical Electoral Roll cohort (but contributing fewer person-years), or may have been removed entirely; the analysis here does not differentiate between these processes.
Recommendations
Based on these findings, we urge the AEC to make historical address data available to approved researchers. Researchers using administrative data in other countries should seek to access similar datasets wherever available. Where such data are unavailable, researchers must take care in the design phase and when interpreting results of their own studies. For example, researchers may choose to limit follow-up periods to minimize the likelihood of migration, may estimate migration rates based on external data on a similar population to the study cohort, or may access data from multiple jurisdictions to improve cross-border capture of outcomes of interest.
Conclusion
This study has demonstrated that incomplete address data can bias findings in studies relying on administrative data. In this example, the bias introduced through incomplete address information caused a reduction in the apparent magnitude of the association between regular GP access and hospitalization. Although the AEC records changes in address for all electors, researchers are generally only provided with the address at a single point in time. We urge the AEC to provide these data where requested, and encourage researchers outside of Australia to utilise similar data wherever possible.
Funding
This work was supported by the National Health and Medical Research Council, project grant APP1078345.
Conflict of interest: None declared.
References
1 Langan SM, Benchimol EI, Guttman A, et al. Setting the RECORD straight: developing a guideline for the REporting of studies Conducted using Observational Routinely collected Data. Clin Epidemiol 2013; 5: 29–31.
2 Hobbs MST, McCall MG. Health statistics and record linkage in Australia. J Chronic Dis 1970; 23: 375–81.
3 Armstrong BK, Kricker A. Record linkage – a vision renewed. Aust N Z J Public Health 1999; 23: 451–52.
4 Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Ann Rev Public Health 2011; 32: 91–108.
5 Cameron CM, Purdie DM, Kliewer EV, Wajda A, McClure RJ. Population health and clinical data linkage: the importance of a population registry. Aust N Z J Public Health 2007; 31: 459–63.
6 Sibthorpe B, Kliewer E, Smith L. Record linkage in Australian epidemiological research: health benefits, privacy safeguards and future potential. Aust J Public Health 1995; 19: 250–56.
7 Mitchell RJ, Cameron CM, Bambach MR. Data linkage for injury surveillance and research in Australia: perils, pitfalls and potential. Aust N Z J Public Health 2014; 38: 275–80.
8 Australian Electoral Commission. Enrolment Statistics. Canberra: Australian Electoral Commission, 2016.
9 Australian Electoral Commission. Enrolment—Frequently Asked Questions. Canberra: Australian Electoral Commission, 2016.
10 Moorin RE, Holman CDJ, Garfield C, Brameld KJ. Health related migration: evidence of reduced ‘urban-drift’. Health Place 2006; 12: 131–40.
11 Ludvigsson J, Otterblad-Olausson P, Petterson B, Ekbom A. The Swedish personal identity number: possibilities and pitfalls in healthcare and medical research. Eur J Epidemiol 2009; 24: 659–67.
12 Australian Bureau of Statistics. Census of Population and Housing. Canberra: Australian Bureau of Statistics, 2016.
13 Norwegian Tax Administration. National Registry Norway. Oslo: Norwegian Tax Administration.
14 Cameron C, Purdie D, Kliewer E, McClure R. Differences in prevalence of pre-existing morbidity between injured and non-injured populations. Bull World Health Organ 2005; 83: 345–52.
15 Australian Electoral Commission. About the Commonwealth Electoral Roll. Canberra: Australian Electoral Commission, 2016.
16 National Records of Scotland. Welcome to Scotland's Census. Edinburgh: National Records of Scotland, 2016.
17 Office for National Statistics. 2011 Census England. Cardiff, UK: Office for National Statistics, 2011.
18 Australian Electoral Commission. Supply of Elector Information for Use in Medical Research. Canberra: Australian Electoral Commission, 2016.
19 Einarsdottir K, Preen DB, Emery JD, Holman CDJ. Regular primary care plays a significant role in secondary prevention of ischaemic heart disease in a Western Australian cohort. J Gen Intern Med 2011; 26: 1092–97.
20 Einarsdottir K, Preen DB, Emery JD, Kelman C, Holman CDJ. Regular primary care lowers hospitalization risk and mortality in seniors with chronic respiratory diseases. J Gen Intern Med 2010; 25: 766–73.
21 Data Linkage Western Australia. Core Data Collections. Perth, WA: WA Department of Health, 2017.
22 van Gool K, Parkinson B, Kenny P. Medicare Australia Data for Research: An Introduction. Sydney, NSW: Centre for Health Economics Research and Evaluation, 2011.
23 Holman CDJ, Bass AJ. Population-based linkage of health records in Western Australia: development of a health services research linked database. Aust N Z J Public Health 1999; 23: 453–59.
24 Gibson DAJ, Moorin RE, Preen D, Emery J, Holman CDJ. Enhanced primary care improves GP service regularity in older patients without impacting on service frequency. Aust J Prim Health 2012; 18: 295–303.
25 Davis WA, Hendrie DH, Knuiman MW, Davis TME. Determinants of diabetes-attributable non-blood glucose-lowering medication costs in type 2 diabetes. Diabetes Care 2005; 28: 329–36.
26 Commonwealth Department of Health and Ageing. National Hospital Cost Data Collection, Rounds 1–9. Canberra: Commonwealth Department of Health and Ageing.
27 Australian Bureau of Statistics. Census of Population and Housing: Socio Economic Indexes for Areas (2001, 2006 & 2011). Canberra: Australian Bureau of Statistics, 2001, 2006, 2011.
28 Australian Population and Migration Research Centre. ARIA (Accessibility / Remoteness Index of Australia). Adelaide, SA: University of Adelaide, 2015.
29 Holman CDJ, Preen DB, Baynham NJ, Finn JC, Semmens JB. A multipurpose comorbidity scoring system performed better than the Charlson index. J Clin Epidemiol 2005; 58: 1006–14.
30 Guardabascio B, Ventura M. Estimating the dose-response function through a generalized linear model approach. Stata J 2014; 14: 141–58.
31 Blakeman T, Harris M, Comino E, Zwar N. Implementation of the enhanced primary care items requires ongoing education and evaluation. Aust Fam Physician 2001; 30: 75–77.
32 Basu J, Friedman B, Burstin H. Primary care, HMO enrollment, and hospitalization for ambulatory care sensitive conditions. Med Care 2002; 40: 1260–69.
33 Billings J, Anderson GM, Newman LS. Recent findings on preventable hospitalizations. Health Aff 1996; 15: 239–49.
34 Bindman AB, Grumbach K, Osmond D, et al. Preventable hospitalizations and access to health care. JAMA 1995; 274: 305–11.
35 Robbins J, Webb DA. Hospital admission rates for a racially diverse low-income cohort of patients with diabetes: The Urban Diabetes Study. Am J Public Health 2006; 96: 1260–64.
36 Bindman AB, Chattopadhyay A, Auerback GM.
Interruptions in Medicaid coverage and risk for hospitalization for ambulatory care-sensitive conditions. Ann Intern Med 2008; 149: 854–60.
37 Australian Bureau of Statistics. Migration, Australia 2014–15. Canberra: Australian Bureau of Statistics, 2016.
38 Australian Bureau of Statistics. Australian Demographic Statistics: Estimated Resident Population, States and Territories. Canberra: Australian Bureau of Statistics, 2016.
© The Author 2017; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association
International Journal of Epidemiology, Oxford University Press

Publisher
Oxford University Press
Copyright
© The Author 2017; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association
ISSN
0300-5771
eISSN
1464-3685
D.O.I.
10.1093/ije/dyx156
Abstract
Background: When information on changes in address or migration of people to or from a study jurisdiction is unavailable in longitudinal studies, issues relating to loss-to-follow-up and misclassification bias may result. This study investigated how estimations of associations between general practitioner (GP) contact and hospital use were affected by incomplete address and migration data.
Methods: This was a retrospective population-based cohort study of Western Australians from 1990 to 2004. Linked administrative data including mortality records, hospital admissions, primary care and Electoral Roll records were used. Regularity of GP contact, based on the variance of the number of days between GP visits, was calculated for each person-year. Outcomes were the number and costs (A$2014) of diabetes-related hospital admissions in the following year. Models were estimated separately for cohorts where (i) postcode was ascertained at study commencement and held constant, and (ii) postcode and residency within Western Australia were updated with each change of address recorded on the Electoral Roll over the study period.
Results: Updating address data reduced total person-years by 11% and changed the distribution of covariates. Estimations of associations between patterns of GP contact and number of hospitalizations changed; the incidence rate ratios measuring the relationship with the most regular GP contact (baseline of those with <2 GP visits) changed from 0.81 [95% confidence interval (CI) 0.66–1.00] to 0.42 (95% CI 0.33–0.53) after updating postcode information. Impacts on cost models were smaller, though still statistically significant.
Conclusions: Longitudinal studies using administrative data may report biased results if they ignore address changes and migration. Researchers should attempt to link to these data wherever possible, or choose study designs which these issues are less likely to affect. Custodians should be aware that such data can be vital to high quality research.
Keywords: Administrative data, time at risk, hospital use, dynamic cohort, migration
Key Messages
Studies relying on administrative data collections can suffer from missing information on the migration of cohort members, which may lead to incorrect calculations of time at risk and may bias study findings.
We estimated associations between general practitioner access and hospital use in a cohort where migration was uncaptured, compared with the same cohort with complete information on migration within and outside the study jurisdiction.
Measures of association were biased in the cohort without migration information.
Estimates of the rate of hospitalizations were affected to a greater extent than estimates of hospitalization costs.
Researchers using administrative data should access information on changes in address, especially migration out of the jurisdiction from which the health service data are available, wherever possible.
Introduction
Recent decades have seen a dramatic rise in the amount of electronic data recorded in the delivery of health care. Although not captured specifically for research purposes, these ‘administrative’ data are frequently used for research, facilitated in part by advances in computing power.1–3 Though studies using such data can have many strengths, a common issue is that information pertinent to the research question may be missing from administrative collections.4 The issue at the centre of this paper is migration of study members out of the study jurisdiction(s). This migration is not captured in most administrative data collections and may lead to a form of misclassification bias.
Where a study population or data collection is defined by residence within a particular jurisdiction, uncaptured migration, whether to or from another state, province or country, can lead to migrants being erroneously assumed as having no health service contacts or outcomes of interest before immigration or following emigration. This will overestimate the size of the cohort/time at risk and potentially lead to inaccurate estimations of the rates, odds or hazards of events.4,5 Furthermore, where geographically derived measures of remoteness or socioeconomic status (SES) are used, uncaptured migration can lead to misclassification of these potential confounders. For these reasons, researchers often use external population data sources, such as census data, to complement health service datasets.4,6 In studies in Australia, the Electoral Roll, maintained by the Australian Electoral Commission (AEC), is frequently used either in defining a cohort or to identify control populations.7 As voting is compulsory for citizens aged 18 and over, the Electoral Roll provides almost comprehensive population data8 incorporating gender, birthdate and residential location; furthermore, address changes are actively captured (including emigration).9,10 Representativeness is diminished by under-representation of Indigenous Australians and younger adults and, furthermore, non-citizens are not captured.10 Most countries that have mature linked administrative data systems also have good quality general population data derived from Census, Electoral Roll or other national registration systems.4,11–17 However, these data are not always available for linkage, and anecdotally, studies incorporating these data generally limit their use to the definition of a study population, identification of matched controls or capture of baseline characteristics. 
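The effect of uncaptured migration on a rate can be seen with simple arithmetic. The short Python sketch below is illustrative only: the function name and every number in it are hypothetical and are not drawn from the study data. It compares a crude event rate computed on the time actually spent in the jurisdiction with the same rate computed when years spent elsewhere are wrongly counted as time at risk.

```python
def crude_rate(events, person_years):
    """Crude event rate per 1000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


# Hypothetical cohort: 50 events are observed within the jurisdiction.
events = 50
resident_person_years = 900.0  # time actually spent in the jurisdiction
absent_person_years = 100.0    # time spent elsewhere, but not recorded as such

# Correct denominator: only time when events could have been observed.
rate_correct = crude_rate(events, resident_person_years)

# Naive denominator: absent years are wrongly treated as event-free time at risk.
rate_naive = crude_rate(events, resident_person_years + absent_person_years)

print(f"correct: {rate_correct:.1f}, naive: {rate_naive:.1f} per 1000 person-years")
```

With these invented figures the naive rate is 50.0 per 1000 person-years against roughly 55.6 for the corrected rate; the direction of the error (underestimation) is the point, not the size.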
In Australia, the AEC may release data to researchers under the Commonwealth Electoral Act 1918;18 however, this is usually only a single cross-sectional snapshot for use in defining a cohort of interest or matched controls. The rationale behind this restriction is unclear; however, it is unlikely to be legislative, since precedents exist for the release of more detailed data.10 The aim of the current paper is to demonstrate the utility of comprehensive longitudinal Electoral Roll data for determining person-time at risk more accurately, and for better accounting for confounding by geographically defined SES and access to health services. This is illustrated through the evaluation of the impact of regularity of general practitioner (GP) contact on numbers and costs of potentially avoidable hospitalizations. Regular GP contact is hypothesized to improve health in chronic disease in part by improving compliance, and has previously been demonstrated to be associated with reduced risk of hospitalization and death in certain groups.19,20
Methods
This was a retrospective population-based cohort study of Western Australian (WA) electors registered with Medicare (Australia’s universal public health system) between 1 July 1990 and 30 June 2004. Individual service records were obtained from the WA Mortality Register (1990 to 2004), WA Hospital Morbidity Data Collection (HMDC) (1980 to 2004) and Medicare Benefits Scheme (MBS) claims database (1984 to 2004). The WA Mortality Register details all deaths registered in WA; the HMDC includes all separation records for all hospitals in WA;21 and the MBS records all services attracting reimbursement through Medicare,22 which includes all GP services. We also obtained WA Electoral Roll data (1988 to 2004) including the first registration, and all records where address information was updated, incorporating the reason for, and date of, the update.
Data were linked by the WA Data Linkage Branch using probabilistic matching, previously demonstrated to have low error rates of 0.11% for both false-positives and false-negatives.23

Cohort membership

Two cohorts were constructed:

(i) a fixed cohort, which people entered at the start of the first financial year following both (a) the date of their first Electoral Roll record and (b) their first indication of diabetes risk (defined in Table 1). Cohort exit was at the end of the last financial year preceding death or the study end date. Entry and exit dates follow the Australian financial year, which runs from 1 July to 30 June. This was termed the ‘Non-Historical Electoral Roll cohort’;

(ii) a dynamic cohort, where entry and final exit were as above but where periods of temporary exit from, and re-entry to, the study cohort were captured via the updated address records on the Electoral Roll. This cohort was termed the ‘Historical Electoral Roll cohort’.

Table 1. Cohorts of confirmed diabetics, people at high risk of diabetes and people with other cardiovascular risk factors as identified through Hospital Morbidity Data Collection (HMDC) and Medicare Benefits Scheme (MBS) datasets

Data source: HMDC
People with diabetes: Diagnosis of diabetes in any diagnosis field on any hospital record, using the International Classification of Diseases 9th Revision, Clinical Modification (ICD-9-CM) codes 250.xx, and the International Classification of Diseases 10th Revision, Australian Modification (ICD-10-AM) codes E10.xx, E11.xx, E13.xx or E14.xx (where xx denotes any digit between 0 and 9), or
People at high risk of diabetes: Diagnosis of impaired glucose function in any hospital record (ICD-9-CM code 790.2, ICD-10-AM codes E09.xx, R73.xx or O24.5), or a diagnosis of obesity on any hospital record (ICD-9-CM codes 278.0, 278.1, ICD-10-AM codes E65, E66.xx) in addition to being Indigenous and aged 45 or over, or
People with cardiovascular risk factors: Diagnosis of ischaemic heart disease or hypertension in any hospital record (ICD-9-CM codes 401.xx–405.xx, 410.xx–415.xx, ICD-10-AM codes I10.xx–I15.xx, I20.xx–I25.xx), or

Data source: MBS
People with diabetes: Diabetes cycle of care consultation (MBS items 2517, 2518, 2521, 2522, 2525, 2526, 2620, 2622, 2624, 2631, 2633, 2635), or glycated haemoglobin (HbA1c) quantitation twice within any 6-month period (MBS items 66551, 66319, 66320, 2043, 2044, 1313, 1314), or quantitation of fructosamine for management of established diabetes (MBS item 66557), or dietetics, education or exercise physiology service for people with type 2 diabetes (MBS items 81100, 81105, 81110, 81115, 81120, 81125)
People at high risk of diabetes: MBS record indicating use of the oral glucose tolerance test (OGTT) outside pregnancy (MBS items 66542, 66419), or MBS record indicating HbA1c quantitation (once within a 6-month period)
People with cardiovascular risk factors: MBS record indicating coronary arteriography or angiocardiography (MBS items 59912, 59925, 59971, 59972, 59974), or MBS record indicating a myocardial perfusion study (MBS items 61302, 61303, 61306, 61307, 61310, 61651, 61652, 61653, 61654)

Eligibility for analysis

The setting for this paper was the analysis of the association between regularity of general practitioner (GP) contact in 1 year (the exposure year), and diabetes-related hospitalization in the following year (the outcome year), with the unit of analysis being a series of pairs (or couplets) of financial years. Each person-year contributes to two couplets, as an exposure and outcome year, except the first and last years for which an individual was in the study (and years immediately following re-entry to the study area/before leaving the study area in the case of the Historical Electoral Roll cohort), which count only as an exposure and an outcome year, respectively. Individuals in the Historical Electoral Roll cohort were only deemed eligible in the financial years during which they resided in WA for a full 365 days in 2 consecutive years, as both the exposure and outcome of interest were calculated over a 1-year period.
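The couplet construction described above can be illustrated with a short sketch. It is not the authors' code: the function name `build_couplets`, the convention of labelling a financial year by the calendar year in which it starts, and the example person are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def build_couplets(eligible_years: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (exposure_year, outcome_year) pairs of consecutive eligible years.

    Assumption: a financial year is labelled by the calendar year in which it
    starts, so 1990 stands for 1 July 1990 to 30 June 1991. A year is
    'eligible' if the person was resident in WA for the whole of it.
    """
    years = sorted(set(eligible_years))
    eligible = set(years)
    # A couplet exists only when the following year is also fully eligible.
    return [(y, y + 1) for y in years if y + 1 in eligible]


# Hypothetical person: resident for all of 1990-1993, away during 1994,
# resident again for all of 1995-1997.
couplets = build_couplets([1990, 1991, 1992, 1993, 1995, 1996, 1997])
print(couplets)
```

In this hypothetical case 1993 serves only as an outcome year and 1995 only as an exposure year, which mirrors the treatment of years adjacent to a temporary exit in the Historical Electoral Roll cohort.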
The Non-Historical cohort was eligible from study entry until death or the study end date, reflecting the methods that this study would have followed if only the typical Electoral Roll data were available. The Historical Electoral Roll cohort is therefore a subset of the Non-Historical Electoral Roll cohort, with the additional address information used to exclude ineligible person-years (which may result in the exclusion of an individual entirely).

Contact with a GP

Regularity of contact with a GP was defined for each financial year from the variance in the number of days between GP visits, as described previously.24 Briefly, for each GP visit within the exposure period (in this case each financial year), the number of days since the previous GP visit (which may have been in the preceding year) was calculated. Based on these values, an annual regularity index (R) was constructed using the formula R = 1/(1 + Variance(Days)). This resulted in a score between 0 and 1 for each person-year, with a score of 1 indicating perfectly regular primary care contact and a score of 0 indicating perfectly irregular primary care contact. The index requires a minimum of two services within a financial year to calculate; people were given a score of 0 for years in which they had fewer than two services. For analysis, regularity was transformed into an ordinal variable by first logarithmically transforming the score (due to its highly skewed nature) and then creating quintiles of regularity (quintile 1 representing the lowest 20% of the logarithmic regularity distribution and quintile 5 the highest 20%). Years with fewer than two GP visits were used as the reference category in all analyses and are referred to as having ‘minimal GP contact’. Note that regularity is independent of frequency of GP attendance (i.e. number of GP visits in the exposure period); frequency was captured as a count and controlled for separately in the analysis.
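As a minimal sketch of the regularity index defined above, the following computes R for one financial year. The text does not state whether a population or sample variance was used; population variance is assumed here, and the function name, its arguments and the example dates are hypothetical.

```python
from datetime import date
from statistics import pvariance
from typing import Optional, Sequence


def regularity_index(visits: Sequence[date],
                     previous_visit: Optional[date] = None) -> float:
    """Compute R = 1 / (1 + Variance(Days)) for one financial year.

    visits: GP visit dates falling within the exposure year.
    previous_visit: the latest visit before the year, if one exists.
    Years with fewer than two visits receive a score of 0, as in the text.
    Assumption: population variance of the gaps between visits.
    """
    if len(visits) < 2:
        return 0.0
    ordered = sorted(visits)
    if previous_visit is not None:
        ordered = [previous_visit] + ordered
    # Days elapsed between each visit and the one before it.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return 1.0 / (1.0 + pvariance(gaps))


# Evenly spaced visits (gaps of 30 and 30 days) give the maximum score.
regular = [date(2000, 7, 1), date(2000, 7, 31), date(2000, 8, 30)]
# Uneven visits (gaps of 10 and 50 days) give a score close to zero.
irregular = [date(2000, 7, 1), date(2000, 7, 11), date(2000, 8, 30)]

print(regularity_index(regular))
print(regularity_index(irregular))
```

Because even modest variation in the gaps pushes R towards zero, the raw scores cluster near the bottom of the range, which is consistent with the logarithmic transformation applied before quintiles were formed.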
Outcomes under study

Conditions for which diabetes is a risk factor were taken from Davis et al.25 and used to identify diabetes-related hospitalizations initiated in each financial year. The outcomes of interest were the number and cumulative cost of diabetes-related hospital admissions within a financial year. Where inter-hospital transfers had taken place, these were regarded as a single episode of care, to prevent over-counting of admissions. The cost of each episode of care was assigned based on the Australian Refined Diagnosis Related Group (AR-DRG) code recorded across all records included in the episode. The average cost for the specific AR-DRG, as reported in the National Hospital Cost Data Collections for Western Australia26 and specific to the date of separation of each hospital record, was used to assign the cost for the episode of care. Estimated costs were inflated to 2014 levels using the Australian Consumer Price Index.

Other covariates

Sex and age were determined from Electoral Roll records. Both SES, using the Socio-economic Indexes for Areas (SEIFA) Index of Relative Socio-economic Disadvantage (IRSD),27 and accessibility to services, using the Accessibility and Remoteness Index of Australia,28 were ascertained using postcode on the Electoral Roll. For the Non-Historical Electoral Roll cohort, these were ascertained at study entry and remained constant. For the Historical Electoral Roll cohort, these were updated for each change in postcode. Comorbidity was assessed using the Multipurpose Australian Comorbidity Scoring System (MACSS).29 For each study year, comorbidity was defined as having a MACSS condition, excluding diabetes, in any HMDC diagnosis field in the previous 5 years. For each financial year, individuals were classified into one of three mutually exclusive risk groups (people with diabetes, at high risk of diabetes, or with cardiovascular risk factors) using information from their MBS and HMDC records.
The criteria used to determine the three risk groups were developed in consultation with our clinical steering panel and are shown in Table 1. Individuals entered a risk group on the day they met the criteria for that group, and exited either upon moving into a higher risk group or upon death. The groups are considered hierarchical; people meeting the criteria for both ‘high risk of diabetes’ and ‘cardiovascular risk factors’ are allocated to the high-risk group.

Statistical analysis

Descriptive statistics were generated to evaluate the differences between the two cohorts with a focus on: (i) the number of individuals; (ii) the number of records (exposure-outcome couplets of analysis); and (iii) changes in residential location. To determine the effect of any differences between the cohorts on the estimated association between regularity of GP contact and subsequent diabetes-related hospitalizations, multivariable regression was undertaken for both cohorts for each era. Negative binomial regression, with the number of days out-of-hospital in each outcome year as the exposure, was used for the number of hospitalizations; generalized linear models with a gamma distribution and identity link were used for cost. Cluster-robust standard errors were used to account for intra-person correlation due to the repeated nature of the observations. Confounding was limited using covariate adjustment by means of a generalized propensity score (GPS) suitable for non-binary exposures.30 The GPS was constructed using the time-invariant covariates sex and Indigenous status and the time-varying covariates age, financial year of exposure, frequency of GP visits (in exposure year) and comorbidity risk group. Models were used to estimate the marginal effects of minimal GP contact (i.e. no regularity score) and each quintile of regularity at the mean of the GPS in each era for each cohort. Marginal means with 95% confidence intervals were plotted.
For each quintile in each era, the marginal means, standard errors of the mean and group sizes were used to test the significance of differences between the two cohorts via unpaired t tests. Analyses were undertaken separately across two eras: Era 1 from 1990/1991 to 1998/1999 and Era 2 from 1999/2000 to 2002/2003, representing shifts in primary care policy in Australia, specifically the introduction of the Enhanced Primary Care programme. This programme provided reimbursement to GPs for performing health assessments, care planning and case conferencing for certain patient groups, with the aim of improving the co-ordination of care.31 Note that although the study period begins in 1990/91, earlier data are used to ascertain comorbid status (which for each exposure year has a 5-year look-back) and regularity (which takes into account the date of the latest GP visit preceding each exposure year).

Results

As shown in Table 2, 20 261 individuals eligible for the study in the Non-Historical Electoral Roll cohort were ineligible in the Historical Electoral Roll cohort as they did not have any eligible couplets of financial years spent entirely in WA. In addition to reducing the cohort size by 6.7%, this also caused small but statistically significant changes across all factors except gender. The differences in how the cohorts were followed up (i.e. whether the cohort was fixed or dynamic) affected the number of records eligible for analysis. As shown in Table 3, there were 211 744 records included in the Non-Historical Electoral Roll cohort which were excluded from the Historical Electoral Roll cohort, and the characteristics of these records showed small differences on all factors evaluated. The largest differences were observed in SES and accessibility to services, where the proportions of records with ‘unknown’ status reduced from above 6% to <1% whereas, on all other variables, changes in the relative size of each category were less than 1%.

Table 2.
Differences in the characteristics of individuals at entry to the Non-Historical and Historical Electoral Roll cohorts

| Status of individuals in cohort | Category | Non-Historical n | Non-Historical % | Historical n | Historical % | Difference n | Difference % | P-value [c] |
|---|---|---|---|---|---|---|---|---|
| Age at cohort entry (years) [d] | Mean | 58.63 | | 59.18 | | 0.55 | | <0.001 |
| | SD | 15.21 | | 14.85 | | −0.36 | | |
| Gender | Female | 150998 | 50.1 | 140424 | 49.9 | −10574 | −7.0 | 0.242 |
| | Male | 150652 | 49.9 | 140965 | 50.1 | −9687 | −6.4 | |
| Indigenous status | No | 274170 | 90.9 | 260369 | 92.5 | −13801 | −5.0 | <0.001 |
| | Yes | 10281 | 3.4 | 9201 | 3.3 | −1080 | −10.5 | |
| | Unknown | 17199 | 5.7 | 11819 | 4.2 | −5380 | −31.3 | |
| Risk status at cohort entry | Confirmed diabetes | 41790 | 13.9 | 39926 | 14.2 | −1864 | −4.5 | <0.001 |
| | High risk of diabetes | 104492 | 34.6 | 93226 | 33.1 | −11266 | −10.8 | |
| | Cardiovascular risk factors | 155368 | 51.5 | 148237 | 52.7 | −7131 | −4.6 | |
| Calendar time period of cohort entry | Era 1 (1990/91–1998/99) | 213150 | 70.7 | 200747 | 71.3 | −12403 | −5.8 | <0.001 |
| | Era 2 (1999/2000–2002/03) | 88500 | 29.3 | 80642 | 28.7 | −7858 | −8.9 | |
| Died during study period | No | 241329 | 80.0 | 224134 | 79.7 | −17195 | −7.1 | 0.001 |
| | Yes | 60321 | 20.0 | 57255 | 20.3 | −3066 | −5.1 | |
| SEIFA IRSD [a] quintile at entry into cohort | Highest disadvantage | 52217 | 17.3 | 50047 | 17.8 | −2170 | −4.2 | 0.793 (excluding unknown); <0.001 (including unknown) |
| | High disadvantage | 79480 | 26.3 | 76892 | 27.3 | −2588 | −3.3 | |
| | Moderate disadvantage | 43205 | 14.3 | 41681 | 14.8 | −1524 | −3.5 | |
| | Less disadvantage | 43485 | 14.4 | 42097 | 15.0 | −1388 | −3.2 | |
| | Least disadvantage | 70607 | 23.4 | 68204 | 24.2 | −2403 | −3.4 | |
| | Unknown | 12656 | 4.2 | 2468 | 0.9 | −10188 | −80.5 | |
| Accessibility to services (ARIA) [b] at entry into cohort | Very remote | 9378 | 3.1 | 8682 | 3.1 | −696 | −7.4 | 0.052 (excluding unknown); <0.001 (including unknown) |
| | Remote | 5143 | 1.7 | 4972 | 1.8 | −171 | −3.3 | |
| | Moderately accessible | 16164 | 5.4 | 15714 | 5.6 | −450 | −2.8 | |
| | Accessible | 16865 | 5.6 | 16469 | 5.9 | −396 | −2.3 | |
| | Highly accessible | 241573 | 80.1 | 233210 | 82.9 | −8363 | −3.5 | |
| | Unknown | 12527 | 4.2 | 2342 | 0.8 | −10185 | −81.3 | |
| Total number of individuals in the cohort | | 301650 | | 281389 | | −20261 | −6.7 | |

[a] Socioeconomic Index for Areas, Index of Relative Disadvantage.
[b] Accessibility and Remoteness Index for Australia.
[c] Significance of difference in the relative number of individuals across categories between Non-Historical and Historical Electoral Roll cohorts undertaken using chi square testing unless otherwise stated.
[d] Significance of difference in mean of age undertaken using two-sided t test.

Table 3. Differences in the characteristics of records (couplets of exposure and outcome years) captured by the Non-Historical and Historical Electoral Roll cohorts

| Status of records in cohort | Category | Non-Historical n | Non-Historical % | Historical n | Historical % | Difference n | Difference % | P-value [c] |
|---|---|---|---|---|---|---|---|---|
| Gender | Female | 965810 | 50.3 | 856969 | 50.2 | −108841 | −11.3 | 0.029 |
| | Male | 954682 | 49.7 | 851779 | 49.8 | −102903 | −10.8 | |
| Indigenous status | No | 1779563 | 92.7 | 1611180 | 94.3 | −168383 | −9.5 | <0.001 |
| | Yes | 66625 | 3.5 | 52365 | 3.1 | −14260 | −21.4 | |
| | Unknown | 74304 | 3.9 | 45203 | 2.6 | −29101 | −39.2 | |
| Risk status | Confirmed diabetic | 478965 | 24.9 | 420893 | 24.6 | −58072 | −12.1 | <0.001 |
| | High risk of diabetes | 426121 | 22.2 | 366542 | 21.5 | −59579 | −14.0 | |
| | Cardiovascular risk factors | 1015406 | 52.9 | 921313 | 53.9 | −94093 | −9.3 | |
| SEIFA IRSD [a] quintile | Highest disadvantage | 329995 | 17.2 | 308175 | 18.0 | −21820 | −6.6 | <0.001 (including and excluding unknown) |
| | High disadvantage | 499152 | 26.0 | 473145 | 27.7 | −26007 | −5.2 | |
| | Moderate disadvantage | 266558 | 13.9 | 251966 | 14.7 | −14592 | −5.5 | |
| | Less disadvantage | 277003 | 14.4 | 261501 | 15.3 | −15502 | −5.6 | |
| | Least disadvantage | 427909 | 22.3 | 404877 | 23.7 | −23032 | −5.4 | |
| | Unknown | 119875 | 6.2 | 9084 | 0.5 | −110791 | −92.4 | |
| Accessibility to services (ARIA) [b] | Very remote | 50333 | 2.6 | 45175 | 2.6 | −5158 | −10.2 | <0.001 (including and excluding unknown) |
| | Remote | 32382 | 1.7 | 30474 | 1.8 | −1908 | −5.9 | |
| | Moderately accessible | 99184 | 5.2 | 93730 | 5.5 | −5454 | −5.5 | |
| | Accessible | 105445 | 5.5 | 100362 | 5.9 | −5083 | −4.8 | |
| | Highly accessible | 1513839 | 78.8 | 1430461 | 83.7 | −83378 | −5.5 | |
| | Unknown | 119309 | 6.2 | 8546 | 0.5 | −110763 | −92.8 | |
| Era in which services were provided | Era 1 (1990/91–1998/99) | 1051127 | 54.7 | 943226 | 55.2 | −107901 | −10.3 | <0.001 |
| | Era 2 (1999/00–2002/03) | 869365 | 45.3 | 765522 | 44.8 | −103843 | −11.9 | |
| Number of GP visits in exposure year | Less than two GP visits | 252939 | 13.2 | 207785 | 12.2 | −45154 | −17.9 | <0.001 |
| | Two or more GP visits | 1667553 | 86.8 | 1500963 | 87.8 | −166590 | −10.0 | |
| Total number of records captured | | 1920492 | | 1708748 | | −211744 | −11.0 | |

[a] Socioeconomic Index for Areas, Index of Relative Disadvantage.
[b] Accessibility and Remoteness Index for Australia.
[c] Significance of difference in the relative number of records across categories between the Non-Historical and Historical Electoral Roll cohorts undertaken using chi square testing unless otherwise stated.

Figure 1 shows that only 57% of individuals did not change their postcode during the study period and thus would have had their postcode correctly identified for their entire follow up with Non-Historical Electoral Roll data. The impact on SES was less dramatic, with only 15% of individuals having one change in SES and less than 5% having two or more. Accessibility to services was more stable, with only 4% having one change in accessibility and less than 1% having two or more changes.

Figure 1. The number and percentage of cohort members with changes in location information during the study period according to Electoral Roll records.
Regularity of GP contact was closely related to the volume of GP contact; in both the Historical and Non-Historical cohorts, those in the lowest quintile had a mean of 4.4 visits [standard deviation (SD) 2.6] compared with 21.0 (SD 14.5) in the highest quintile. Table 4 shows the results of modelling the association between regularity of GP access and diabetic-related hospitalization in the following year for both cohorts. Table 4A presents the magnitude of the relative effect of regularity on rates of diabetic-related hospitalizations, derived from negative binomial regression models. The relative effect of regularity was greater for the Historical Electoral Roll cohort, particularly in the highest quintile of regularity where the relative effect was nearly doubled [incidence rate ratio (IRR) in Era 1: 0.49, 95% CI 0.40 to 0.60, in Era 2: 0.42, 95% CI 0.33 to 0.53] compared with that observed for the Non-Historical Electoral Roll cohort (Era 1: 0.80, 95% CI 0.67 to 0.97, in Era 2: 0.81, 95% CI 0.66 to 1.00). The predicted rates of diabetic-related hospitalization from the models are shown in Figure 2 (A and B), showing that at minimal GP contact or low regularity, the Non-Historical Electoral Roll cohort underestimated the rate of diabetic-related hospitalization, whereas at higher regularity, it tended to be overestimated. Differences in marginal means between the Historical and Non-Historical Electoral Roll cohorts were significant for the groups with minimal GP visits and highest regularity in both Eras, and for those with the lowest regularity in the first Era only.

Table 4.
The impact of Historical Electoral Roll data on the relationship between regularity of attendance with a general practitioner and (A) the rate of diabetic-related hospitalization in the following year, from negative binomial regression models, and (B) the cost (A$ 2014) of diabetic-related hospitalization in the following year, from generalised linear regression models

A. The rate of diabetic-related hospitalization in the following year [b]

| | Non-Historical Electoral Roll, Era 1: IRR [a] (95% CI) | P-value | Historical Electoral Roll, Era 1: IRR [a] (95% CI) | P-value | Non-Historical Electoral Roll, Era 2: IRR [a] (95% CI) | P-value | Historical Electoral Roll, Era 2: IRR [a] (95% CI) | P-value |
|---|---|---|---|---|---|---|---|---|
| Minimal GP contact [c] | Reference | | Reference | | Reference | | Reference | |
| Lowest quintile of regularity | 0.65 (0.57 to 0.75) | <0.001 | 0.55 (0.47 to 0.64) | <0.001 | 0.61 (0.52 to 0.71) | <0.001 | 0.43 (0.36 to 0.51) | <0.001 |
| Quintile 2 | 0.50 (0.43 to 0.59) | <0.001 | 0.37 (0.32 to 0.44) | <0.001 | 0.52 (0.44 to 0.62) | <0.001 | 0.35 (0.29 to 0.43) | <0.001 |
| Quintile 3 | 0.56 (0.47 to 0.65) | <0.001 | 0.39 (0.33 to 0.47) | <0.001 | 0.67 (0.56 to 0.81) | <0.001 | 0.41 (0.33 to 0.50) | <0.001 |
| Quintile 4 | 0.67 (0.57 to 0.80) | <0.001 | 0.43 (0.35 to 0.52) | <0.001 | 0.67 (0.55 to 0.81) | <0.001 | 0.37 (0.30 to 0.46) | <0.001 |
| Highest quintile of regularity | 0.80 (0.67 to 0.97) | 0.021 | 0.49 (0.40 to 0.60) | <0.001 | 0.81 (0.66 to 1.00) | 0.0550 | 0.42 (0.33 to 0.53) | <0.001 |
| Generalized propensity score | 0.70 (0.56 to 0.86) | 0.001 | 1.40 (1.12 to 1.76) | 0.004 | 1.33 (1.02 to 1.74) | 0.0350 | 3.11 (2.31 to 4.18) | <0.001 |
| Constant | 0.00 (0.00 to 0.00) | <0.001 | 0.00 (0.00 to 0.00) | <0.001 | 0.00 (0.00 to 0.00) | <0.001 | 0.00 (0.00 to 0.00) | <0.001 |

B. The cost (A$ 2014) of diabetic-related hospitalizations in the following year [b]

| | Non-Historical Electoral Roll, Era 1: Coef [d] (95% CI) | P-value | Historical Electoral Roll, Era 1: Coef [d] (95% CI) | P-value | Non-Historical Electoral Roll, Era 2: Coef [d] (95% CI) | P-value | Historical Electoral Roll, Era 2: Coef [d] (95% CI) | P-value |
|---|---|---|---|---|---|---|---|---|
| Minimal GP contact [c] | Reference | | Reference | | Reference | | Reference | |
| Lowest quintile of regularity | −820.6 (−973.0 to −668.2) | <0.001 | −1097.3 (−1263.2 to −931.5) | <0.001 | −645.3 (−710.6 to −580.0) | <0.001 | −921.3 (−998.7 to −843.8) | <0.001 |
| Quintile 2 | −1090.3 (−1242.5 to −938.1) | <0.001 | −1414.6 (−1581.1 to −1248.1) | <0.001 | −722.9 (−794.1 to −651.6) | <0.001 | −1047.8 (−1130.8 to −964.7) | <0.001 |
| Quintile 3 | −984.8 (−1142.9 to −826.8) | <0.001 | −1350.6 (−1520.8 to −1180.3) | <0.001 | −628.6 (−708.7 to −548.6) | <0.001 | −950.0 (−1041.4 to −858.5) | <0.001 |
| Quintile 4 | −861.7 (−1033.3 to −690.2) | <0.001 | −1287.0 (−1467.5 to −1106.4) | <0.001 | −572.4 (−662.0 to −482.9) | <0.001 | −949.1 (−1047.3 to −851.0) | <0.001 |
| Highest quintile of regularity | −593.7 (−772.5 to −414.9) | <0.001 | −1046.7 (−1235.1 to −858.3) | <0.001 | −293.8 (−403.5 to −184.1) | <0.001 | −701.4 (−822.7 to −580.1) | <0.001 |
| Generalized propensity score | 617.3 (473.8 to 760.8) | <0.001 | 1113.0 (974.5 to 1251.4) | <0.001 | 951.0 (834.9 to 1067.0) | <0.001 | 1308.2 (1192.0 to 1424.4) | <0.001 |
| Constant | 1803.9 (1653.4 to 1954.3) | <0.001 | 1928.7 (1766.3 to 2091.0) | <0.001 | 1060.0 (996.5 to 1123.5) | <0.001 | 1245.8 (1172.0 to 1319.6) | <0.001 |

Era is the calendar time period in which the exposure was ascertained: Era 1 (1990/91–1998/99) and Era 2 (1999/2000–2002/03).
[a] Incidence rate ratio: ratio of the incidence rate of diabetic-related hospitalizations in the following year for each scenario compared with the incidence rate in the reference scenario.
[b] All analyses control for confounding using a generalized propensity score constructed including gender, age, Indigenous status, year of exposure, frequency of general practitioner contact, comorbidities, diabetes risk level, socioeconomic disadvantage and service accessibility.
[c] Reference category is fewer than two GP visits in the exposure year (i.e. no regularity score).
[d] Coefficient: difference in the mean cost (A$ 2014) of diabetic-related hospitalization in the following year compared with that of the reference scenario.

Figure 2. Predicted rates and costs of diabetic-related hospitalizations by quintile of regularity, Historical and Non-Historical Electoral Roll cohorts presented separately.

Table 4 (B) and Figure 2 (C and D) show results for the evaluation of the association between regularity and cost of diabetic-related hospitalization using generalized linear models, which were similar. The models show a larger relative effect size for the Historical Electoral Roll cohort. The coefficient for the highest quintile of regularity in Era 1 was −1046.7 (95% CI −1235.1 to −858.3) in the Historical Electoral Roll cohort compared with −593.7 (95% CI −772.5 to −414.9) in the Non-Historical cohort; in Era 2, coefficients were −701.4 (95% CI −822.7 to −580.1) and −293.8 (95% CI −403.5 to −184.1) for the Historical and Non-Historical cohorts, respectively. Figure 2 (C and D), presenting marginal mean costs, indicates that this is largely driven by an underestimation of the cost at minimal GP contact and lower levels of regularity, whereas estimates of mean costs were similar at higher levels of regularity. Differences in marginal means between cohorts were significant for the minimal GP contact group in both Eras and for the least regular quintile in Era 2 only, and were non-significant in other cases.

Discussion

Summary of findings

These results suggest that where longitudinal studies do not follow participants to update residential addresses, results may be biased.
In this analysis, when data were updated with historical address information, the cohort size and person-time at risk were reduced by 6.7% and 11.0%, respectively, and changes in SES and remoteness that would otherwise have gone unrecorded were captured. Effect estimates were also biased; the negative associations between regularity of GP contact and both number and cost of hospitalizations were significantly weaker when historical address information was not used.

Interpretation of results

Records appearing in the Non-Historical Electoral Roll cohort that were not included in the Historical Electoral Roll cohort represent years in which individuals were out of WA or incarcerated for all or part of the year, and therefore had no or limited opportunity to interact with the WA Health system. These records would have the effect of suggesting an association between lower regularity of GP access and lower hospital use. In both cohorts, individuals with minimal GP contact tended to have higher levels of hospital use. The inclusion of the erroneous records in the Non-Historical analysis reduced the magnitude of this association, with the number and cost of hospitalizations underestimated in those with minimal GP contact, and minor differences between estimates at the remaining quintiles of regularity. There are clear policy implications for the publication of such biased results. Whereas the results using Historical Electoral Roll data may provide an incentive for a policy maker to promote GP access or remove barriers to access, in the expectation of reduced demand on hospitals, the publication of biased analyses based on the Non-Historical Electoral Roll data would result in a diminished incentive for action in this area. Results for hospitalization rates were biased to a greater extent than hospitalization costs, which may be due to the different modelling techniques used.
The evaluation of hospitalization rates incorporated person-time at risk as the denominator, a quantity that differed substantially between the two cohorts. The GLM method used for evaluating the cost of hospitalizations does not use a denominator and would not have been as affected by the overestimation of person-time at risk. It is likely that inaccuracy in SES and remoteness in the analysis using Non-Historical Electoral Roll data affected effect estimates, given their known influence on access to services,32–34 but it is unclear how this might have biased findings.

Implications and generalizability
The literature presents examples of studies which may have been similarly strengthened by the inclusion of historical address information. Robbins et al. identified a low-income population with diabetes from Philadelphia Health Care Centres, to assess racial/ethnic disparities in hospital use.35 The authors acknowledge a lack of data concerning admissions outside Pennsylvania and movements out-of-state as a limitation. Similarly, Bindman et al. examined the impact of breaks in public insurance coverage on the risk of hospitalization for certain conditions in California, and again acknowledge a lack of data on hospitalizations out-of-state as potentially impacting on findings.36 Authors in both papers acknowledge that data on hospitalizations out-of-state could have improved the accuracy of results; information on migration could have been similarly useful by allowing time at risk to be corrected. Approximately 2–3% of West Australian residents emigrate either interstate or overseas each year37,38 (workings not shown), though rates for citizens, as included in this analysis, may differ. Researchers studying populations with lower migration rates (for example due to age or health status) may be less concerned about these issues.
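The way unrecorded out-migration inflates a person-time denominator can be illustrated with a short calculation. The sketch below is not the method used in this study; it is a hypothetical closed cohort in which a constant proportion of those remaining leaves the jurisdiction each year, leavers contribute half a year on average in the year they leave, nobody returns, and deaths are ignored. The function names, cohort size, event count and the 2.5% exit rate are all illustrative assumptions, and the output is not intended to reproduce the 11.0% reduction in person-time reported above.

```python
def person_years(n_start: int, years: int, annual_exit_rate: float) -> tuple[float, float]:
    """Return (naive, corrected) person-years for a hypothetical closed cohort.

    naive:     everyone is assumed to stay for the whole follow-up.
    corrected: a constant share of those remaining leaves each year and is
               assumed to contribute half a year in the year of departure.
    """
    naive = float(n_start * years)
    corrected = 0.0
    remaining = float(n_start)
    for _ in range(years):
        leavers = remaining * annual_exit_rate
        corrected += remaining - 0.5 * leavers  # stayers: 1 year, leavers: 0.5 year
        remaining -= leavers
    return naive, corrected


def crude_rate_per_1000(events: float, py: float) -> float:
    """Crude event rate per 1000 person-years."""
    return 1000.0 * events / py


if __name__ == "__main__":
    # Illustrative inputs only: 10 000 people, 15 years, 2.5% leaving per year.
    naive_py, corrected_py = person_years(10_000, 15, 0.025)
    events = 1_200  # hypothetical count of admissions observed within the jurisdiction

    print(f"naive person-years:     {naive_py:.0f}")
    print(f"corrected person-years: {corrected_py:.0f}")
    print(f"overcount of time at risk: {100 * (naive_py / corrected_py - 1):.1f}%")
    print(f"rate, naive denominator:     {crude_rate_per_1000(events, naive_py):.2f} per 1000 PY")
    print(f"rate, corrected denominator: {crude_rate_per_1000(events, corrected_py):.2f} per 1000 PY")
```

Because admissions outside the jurisdiction are not observed, the numerator is the same in both calculations and only the denominator changes, so the naive rate is always the lower of the two under these assumptions. A cost model with no person-time denominator would not be affected through this particular route.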
Strengths and limitations
This study was based on a large dataset including all West Australian electors, and we consider the findings to be generalizable to most developed nations. Aside from the address data used, the two analyses were identical and we can be confident in the internal validity of this work. The data available for this study are from 1990 to 2004, and therefore the association between regularity and hospitalization may have changed more recently. However, describing the contemporaneous relationship was not the purpose of this study and thus the age of the data is immaterial. There are almost certainly incorrect links in the data used, though these are likely rare with the system used to perform the linkages, previously demonstrated to have proportions of both false-positives and false-negatives of 0.11%.23 Although the analysis included here is a useful example of this issue, it does not provide any indication of the magnitude or direction of any bias that may be introduced into other studies, which will differ in design, cohort, follow-up period and so on. For example, the missing address information might have impacted on this analysis differently had we used a time-to-event methodology or examined a different cohort where migration patterns might have been different. We use an example in which both the exposure (GP contact) and outcome (hospital use) can be affected by the issues described, thus this may be an ‘extreme’ example of how missing historical address data may impact upon results. Furthermore, we cannot estimate the extent to which the difference in results between the two cohorts here arises from incorrect ascertainment of SES/remoteness as opposed to overestimation of time at risk. Where time at risk was overestimated, individuals may have remained in the Historical Electoral Roll cohort (but contributing fewer person-years), or may have been removed entirely; the analysis here does not differentiate between these processes.
Recommendations
Based on these findings, we urge the AEC to make historical address data available to approved researchers. Researchers using administrative data in other countries should seek to access similar datasets wherever available. Where such data are unavailable, researchers must take care in the design phase and when interpreting results of their own studies. For example, researchers may choose to limit follow-up periods to minimize the likelihood of migration, may estimate migration rates based on external data on a similar population to the study cohort, or may access data from multiple jurisdictions to improve cross-border capture of outcomes of interest.

Conclusion
This study has demonstrated that incomplete address data can bias findings in studies relying on administrative data. In this example, the bias introduced through incomplete address information caused a reduction in the apparent magnitude of the association between regular GP access and hospitalization. Although the AEC records changes in address for all electors, researchers are generally only provided with the address at a single point in time. We urge the AEC to provide these data where requested, and encourage researchers outside of Australia to utilise similar data wherever possible.

Funding
This work was supported by the National Health and Medical Research Council, project grant APP1078345.

Conflict of interest: None declared.

References
1 Langan SM, Benchimol EI, Guttman A, et al. Setting the RECORD straight: developing a guideline for the REporting of studies Conducted using Observational Routinely collected Data. Clin Epidemiol 2013; 5: 29–31.
2 Hobbs MST, McGall MG. Health statistics and record linkage in Australia. J Chronic Dis 1970; 23: 375–81.
3 Armstrong BK, Kricker A. Record linkage – a vision renewed. Aust N Z J Public Health 1999; 23: 451–52.
4 Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Ann Rev Public Health 2011; 32: 91–108.
5 Cameron CM, Purdie DM, Kliewer EV, Wajda A, McClure RJ. Population health and clinical data linkage: the importance of a population registry. Aust N Z J Public Health 2007; 31: 459–63.
6 Sibthorpe B, Kliewer E, Smith L. Record linkage in Australian epidemiological research: health benefits, privacy safeguards and future potential. Aust J Public Health 1995; 19: 250–56.
7 Mitchell RJ, Cameron CM, Bambach MR. Data linkage for injury surveillance and research in Australia: perils, pitfalls and potential. Aust N Z J Public Health 2014; 38: 275–80.
8 Australian Electoral Commission. Enrolment Statistics. Canberra: Australian Electoral Commission, 2016.
9 Australian Electoral Commission. Enrolment—Frequently Asked Questions. Canberra: Australian Electoral Commission, 2016.
10 Moorin RE, Holman CDJ, Garfield C, Brameld KJ. Health related migration: evidence of reduced ‘urban-drift’. Health Place 2006; 12: 131–40.
11 Ludvigsson J, Otterblad-Olausson P, Petterson B, Ekbom A. The Swedish personal identity number: possibilities and pitfalls in healthcare and medical research. Eur J Epidemiol 2009; 24: 659–67.
12 Australian Bureau of Statistics. Census of Population and Housing. Canberra: Australian Bureau of Statistics, 2016.
13 Norwegian Tax Administration. National Registry Norway. Oslo: Norwegian Tax Administration.
14 Cameron C, Purdie D, Kliewer E, McClure R. Differences in prevalence of pre-existing morbidity between injured and non-injured populations. Bull World Health Organ 2005; 83: 345–52.
15 Australian Electoral Commission. About the Commonwealth Electoral Roll. Canberra: Australian Electoral Commission, 2016.
16 National Records of Scotland. Welcome to Scotland's Census. Edinburgh: National Records of Scotland, 2016.
17 Office for National Statistics. 2011 Census England. Cardiff, UK: Office for National Statistics, 2011.
18 Australian Electoral Commission. Supply of Elector Information for Use in Medical Research. Canberra: Australian Electoral Commission, 2016.
19 Einarsdottir K, Preen DB, Emery JD, Holman CDJ. Regular primary care plays a significant role in secondary prevention of ischaemic heart disease in a Western Australian cohort. J Gen Intern Med 2011; 26: 1092–97.
20 Einarsdottir K, Preen DB, Emery JD, Kelman C, Holman CDJ. Regular primary care lowers hospitalization risk and mortality in seniors with chronic respiratory diseases. J Gen Intern Med 2010; 25: 766–73.
21 Data Linkage Western Australia. Core Data Collections. Perth, WA: WA Department of Health, 2017.
22 van Gool K, Parkinson B, Kenny P. Medicare Australia Data for Research: An Introduction. Sydney, NSW: Centre for Health Economics Research and Evaluation, 2011.
23 Holman CDJ, Bass AJ. Population-based linkage of health records in Western Australia: development of a health services research linked database. Aust N Z J Public Health 1999; 23: 453–59.
24 Gibson DAJ, Moorin RE, Preen D, Emery J, Holman CDJ. Enhanced primary care improves GP service regularity in older patients without impacting on service frequency. Aust J Prim Health 2012; 18: 295–303.
25 Davis WA, Hendrie DH, Knuiman MW, Davis TME. Determinants of diabetes-attributable non-blood glucose-lowering medication costs in type 2 diabetes. Diabetes Care 2005; 28: 329–36.
26 Commonwealth Department of Health and Ageing. National Hospital Cost Data Collection, Rounds 1–9. Canberra: Commonwealth Department of Health and Ageing.
27 Australian Bureau of Statistics. Census of Population and Housing: Socio Economic Indexes for Areas (2001, 2006 & 2011). Canberra: Australian Bureau of Statistics, 2001, 2006, 2011.
28 Australian Population and Migration Research Centre. ARIA (Accessibility / Remoteness Index of Australia). Adelaide, SA: University of Adelaide, 2015.
29 Holman CDJ, Preen DB, Baynham NJ, Finn JC, Semmens JB. A multipurpose comorbidity scoring system performed better than the Charlson index. J Clin Epidemiol 2005; 58: 1006–14.
30 Guardabascio B, Ventura M. Estimating the dose-response function through a generalized linear model approach. Stata J 2014; 14: 141–58.
31 Blakeman T, Harris M, Comino E, Zwar N. Implementation of the enhanced primary care items requires ongoing education and evaluation. Aust Fam Physician 2001; 30: 75–77.
32 Basu J, Friedman B, Burstin H. Primary care, HMO enrollment, and hospitalization for ambulatory care sensitive conditions. Med Care 2002; 40: 1260–69.
33 Billings J, Anderson GM, Newman LS. Recent findings on preventable hospitalizations. Health Aff 1996; 15: 239–49.
34 Bindman AB, Grumbach K, Osmond D, et al. Preventable hospitalizations and access to health care. JAMA 1995; 274: 305–11.
35 Robbins J, Webb DA. Hospital admission rates for a racially diverse low-income cohort of patients with diabetes: The Urban Diabetes Study. Am J Public Health 2006; 96: 1260–64.
36 Bindman AB, Chattopadhyay A, Auerback GM. Interruptions in Medicaid coverage and risk for hospitalization for ambulatory care-sensitive conditions. Ann Intern Med 2008; 149: 854–60.
37 Australian Bureau of Statistics. Migration, Australia 2014–15. Canberra: Australian Bureau of Statistics, 2016.
38 Australian Bureau of Statistics. Australian Demographic Statistics: Estimated Resident Population, States and Territories. Canberra: Australian Bureau of Statistics, 2016.

© The Author 2017; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association

Journal

International Journal of Epidemiology, Oxford University Press

Published: Feb 1, 2018
