TY - JOUR
AU - Gallo, Giovanni
AB - Abstract

Background
We estimated the cumulative number of people diagnosed with human immunodeficiency virus (HIV) infection in a region of Italy by cross-linking data from four surveillance systems and applying capture-recapture methods.

Methods
The study was conducted using data referring to residents of the Veneto Region (population 4.4 million). We cross-linked data from the AIDS Registry (data 1983–2000), the HIV Registry (1988–2000), the Death Registry (1992–1999), and the Hospital-Discharge Registry (1997–2000), using a code based on name, birth date, and sex. Software designed for capture-recapture models (CARE-1) was used to estimate the size of the target population with two different statistical approaches (sample coverage and log-linear models).

Results
A total of 2801 people were reported to the AIDS Registry, 6415 to the HIV Registry, 1598 to the Death Registry as HIV/AIDS-related deaths, and 3330 to the Hospital-Discharge Registry with a diagnosis of HIV infection. Overall, 8723 people were present in at least one registry: 4896 people were present in only one registry, 2387 in two registries, 1286 in three registries, and 154 in all four registries. Using the sample coverage approach, we estimated that, since the beginning of the epidemic in Veneto, 11 281 people (95% CI: 10 981, 11 621) should have been reported to at least one registry; thus the estimated coverage of the four registries was 77.3% (i.e. 8723/11 281). Results obtained applying the log-linear approach were similar, although the fit of this model was not adequate.

Conclusions
Cross-linking data from four different sources and applying the capture-recapture method can improve the accuracy of estimates of the size of the HIV epidemic.
Keywords: Surveillance systems, HIV, AIDS, causes of deaths, hospital discharges, capture-recapture

Before highly active antiretroviral therapy (HAART) was introduced in 1996, AIDS surveillance was the most common means of monitoring the epidemic of human immunodeficiency virus (HIV) infection in industrialized countries.1–4 However, the slowing of disease progression induced by HAART5 has complicated the interpretation and use of AIDS data, and as a consequence, the surveillance of new diagnoses of HIV infection has gained importance in the monitoring of the HIV epidemic.6,7 Nonetheless, the data provided by the surveillance of HIV diagnoses have not been thoroughly evaluated for their completeness or validity or with regard to the timeliness of reporting. In a study conducted in the UK in 1994, cases of HIV infection were matched with those of AIDS, and the completeness of HIV reporting was estimated to be 80–84%.8 A recent study conducted in a region of Italy showed that more than 30% of the reported AIDS cases had not been reported to the HIV surveillance system, and the factor most associated with underreporting was the patient’s having discovered his/her seropositivity only upon AIDS diagnosis.9 In a study conducted in Finland, 22% of the AIDS cases in the period 1991–1995 and 40% of the cases in 1996–2000 had not been reported to the HIV surveillance system, and the factor most associated with underreporting was the patient’s recent immigration.
In the same study, when the data provided by HIV and AIDS surveillance were cross-linked with data from an antenatal HIV screening programme, an additional 48 HIV-infected women were detected, whereas cross-linking with data from the Institute for Forensic Pathology and from reports of injecting drug users with newly acquired hepatitis A or B virus infection only revealed 5 additional cases.10

To the best of our knowledge, no studies have estimated the unknown proportion of individuals diagnosed with HIV using capture-recapture methodology, which, in the field of epidemiology, is commonly used to estimate the size of a population when data from different sources are available.11 The objective of the present study was to estimate the cumulative number of people diagnosed with HIV infection in a region of Italy by cross-linking data from four different surveillance systems and applying capture-recapture methods.

Methods

The analyses were conducted on data from people in the Veneto Region (north-east Italy; Italy is divided into 20 geographical regions). The region covers a territory of approximately 18 400 km² and has a population of more than 4 400 000 inhabitants. The four surveillance systems considered in this study were the Italian National AIDS Registry, the HIV Registry of the Veneto Region, the Registry of Deaths of the Veneto Region, and the Hospital-Discharge Registry of the Veneto Region.

The National AIDS Registry

Italy’s AIDS Registry1 is similar to those of other European countries.2 The reporting of AIDS cases began in 1982 on a voluntary basis, becoming mandatory in 1987. The Registry is managed by the National AIDS Centre of the ‘Istituto Superiore di Sanità’ (Italy’s National Institute of Health). Initially, the definition used for diagnosing AIDS was that of the US Centers for Disease Control (CDC), but since January 1993, the European version of the 1993 CDC definition has been used (i.e. clinical but not immunological extension).12 When an individual is diagnosed with AIDS, the diagnosing physician fills out a specific form and sends copies to the National AIDS Centre and the Regional Health Authority. The form contains the individual’s full name, date of birth, sex, nationality, the city/town of residence and of diagnosis, the date of diagnosis, the HIV exposure category, and, since 1996, the date of the first positive HIV-test result; the date that the form is received by the Registry is also reported. For people who have died, the form also contains the date, cause, and place of death, although the reporting of this information is not mandatory. The data used for the present analysis were those updated to 30 June 2001 and referring to people who were diagnosed with AIDS in 1983–2000 and whose reported city/town of residence was located in the Veneto Region.

The HIV Registry of the Veneto Region

The HIV Registry of the Veneto Region was created in April 1988 (no nation-wide system exists). The Registry collects information on the results of HIV tests, whether positive or negative.13 Every time a person undergoes a serological test at a Local Health Unit, the physician fills out a specific form which contains the results of the test, a code indicating the HIV exposure category, which is determined during pre-test counselling, and an individual identification code of 11 characters: the first and third letters of the surname, the first and third letters of the first name, the day, the month, and the last two digits of the year of birth, and either ‘M’ for males or ‘F’ for females. This information is then recorded in a computerized database at the Local Health Unit, and every 3 months the data are sent to the Regional Health Authority. HIV testing is performed using a high-sensitivity enzyme-linked immunosorbent assay (ELISA), and reactive sera are confirmed using Western Blot techniques.
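The construction of the 11-character identification code described above can be sketched as follows. This is only an illustration: the function name, the handling of non-letter characters, and the padding of names shorter than three letters with "X" are assumptions made here, not rules documented for the Registry.

```python
from datetime import date


def make_identification_code(surname: str, first_name: str, birth: date, sex: str) -> str:
    """Build an 11-character code: 4 name letters, DDMMYY of birth, and sex."""

    def first_and_third(name: str) -> str:
        # Keep letters only, in upper case. Padding short names with "X"
        # is an assumption of this sketch, not a documented registry rule.
        letters = [ch for ch in name.upper() if ch.isalpha()]
        letters += ["X"] * (3 - len(letters))
        return letters[0] + letters[2]

    if sex not in ("M", "F"):
        raise ValueError("sex must be 'M' or 'F'")
    return (
        first_and_third(surname)
        + first_and_third(first_name)
        + birth.strftime("%d%m%y")  # day, month, last two digits of the year
        + sex
    )


# Hypothetical person, used only to show the layout of the code.
example_code = make_identification_code("Rossi", "Mario", date(1965, 3, 7), "M")
print(example_code, len(example_code))
```

Because the code keeps only four letters of the name, two different people can share the same code, which is why the uniqueness of the code is discussed as a limitation later in the article.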
The data used for the present analysis were those updated to 30 June 2001 and referring to people who were diagnosed with HIV infection in the period 1988–2000.

The Registry of Deaths of the Veneto Region

Since 1887, the Italian National Census Bureau (ISTAT) has been collecting information on the causes of death of all deceased people in Italy.14 For each death, the doctor who confirms death fills out a notification form which includes the name, date of birth, sex, and place of residence of the deceased person and the presumed initial, intermediate, and final causes of death, codified using the Ninth and Tenth Revisions of the International Classification of Diseases (ICD-9 and ICD-10). A copy of the form is sent to both ISTAT and the Local Health Unit where the person had lived, which then sends the data to the Regional Health Authority, where they are recorded in a regional database. At ISTAT, the data are also recorded in a database, although only the year of birth, and not the name of the person, is recorded. In 1992, the Regional Health Authority of the Veneto Region began recording the name and place and date of birth contained in these forms in a computerized database. However, from 1992 to 1995, the notification forms were only received from certain areas of the Region, although these data covered approximately 90% of the total deaths in the Region. Since 1996, the data have covered all deaths of residents of the Region, whether they died in the Region or elsewhere. The data used in this analysis refer to people who died between 1992 and 1999 and for whom HIV infection or AIDS (i.e. ICD-9: 279.1; or ICD-10: 042–044) was recorded as the initial, intermediate, or final cause of death.

Hospital-Discharge Registry of the Veneto Region

The Hospital-Discharge Registry of the Veneto Region, created in 1985, records all discharges from all public hospitals and from private hospitals reimbursed by the National Health System located in the Region.
Hospital-discharge forms report the principal and secondary (up to three) diagnoses and the ICD-9 codes.15 Before 1997, the form only included the ICD-9 code and no other information on the patient, whereas since 1997 the patient’s first and last name, date of birth, and sex have been recorded. The data on the form are recorded in the Hospital-Discharge Registry. The data used in this analysis were those updated to 30 June 2001 and referring to all those discharged between 1997 (i.e. when the personal information began to be recorded on the form) and the year 2000 and whose discharge form included a diagnosis of HIV infection (i.e. ICD-9: 279.1; or ICD-10: 042–044). We excluded from the analysis those records which did not include the patient’s name in full (e.g. only the last name and the first initial), the date of birth, or sex, resulting in the exclusion of 30% of the discharge records.

Data quality control

For the HIV Registry, in addition to identical identification codes, the software used by the registry automatically identifies codes that are very similar (i.e. either codes that share 10 of the 11 characters or codes with the same date of birth and sex and whose four letters indicating the surname and first name are the same yet inverted). The Regional Health Authority then contacts the Local Health Unit that provided the data and requests that the accuracy of the data be checked by reviewing the original patient files. For the AIDS Registry, the procedure is practically identical, with the software automatically performing a control based on patient name. The Hospital-Discharge Registry does not adopt this system of quality control; however, for each record we created an individual identification code (i.e. the same as that used by the HIV Registry) and applied software that checks for the same potential errors described above. When codes were similar, we checked other data on the record (i.e. name, date of birth, and sex) and decided which records probably referred to the same person. For the Registry of Deaths, it was assumed that the problem of a person being reported more than once did not exist. The control for similar identification codes was again performed once the data from all four registries were linked. The possible sources of input errors were also considered when identifying similar codes, such as the proximity of keys on the computer keyboard or the fact that the letters n, m, and u are very similar when handwritten.

Linkage among the registries

Data from the four registries were linked using the code of the HIV Registry. The linkage was restricted to people residing in the Veneto Region. The dates of the events (i.e. HIV testing, AIDS diagnosis, and hospital discharge) were not considered when matching was performed because these events are not necessarily sequential. The date of death was checked only to verify that it did not precede the dates of the other events; no such inconsistency occurred in our study. All links were performed using the Access program (Microsoft, Seattle, USA).

Estimate of the cumulative number of HIV diagnoses

To estimate the cumulative number of people diagnosed with HIV infection, we used capture-recapture techniques. In brief, the recapture information (i.e. the source-overlap information or source intersection) is used to estimate the number of people not detected by any of the sources, under certain assumptions (the assumptions made when using these methods have been reviewed by Chao et al.).11 For example, when two data sources are available, only three frequencies can be observed: people identified only by source 1; people identified only by source 2; and people identified by both. Let $n_1$ and $n_2$ denote the total numbers of people identified by source 1 and by source 2 (each total including those identified by both), and $n_{12}$ the number identified by both. If we assume that the two sources capture people independently, we can estimate the population size ($N$) as $N = n_1 n_2 / n_{12}$ (known as ‘Petersen’s estimator’).
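As a worked illustration of the two-source case, the sketch below applies Petersen's estimator to the AIDS Registry and the HIV Registry, taking n1 and n2 as the total number of people in each source and n12 as the number found in both. The overlap of 1653 is obtained by summing the Table 2 rows in which both registries are marked "Yes". The authors do not report this two-source figure; it is shown only to make the formula concrete and rests on the independence assumption, which the multi-source methods described next are designed to relax. The function and variable names are illustrative.

```python
def petersen_estimate(n1: int, n2: int, n12: int) -> float:
    """Two-source estimate of population size, assuming independent sources.

    n1, n2: total people identified by source 1 and by source 2
            (each total includes people found in both sources).
    n12:    people identified by both sources.
    """
    if n12 <= 0:
        raise ValueError("at least one person must appear in both sources")
    return n1 * n2 / n12


# Totals from Tables 1 and 2: AIDS Registry, HIV Registry, and their overlap.
aids_total, hiv_total, in_both = 2801, 6415, 1653
n_hat = petersen_estimate(aids_total, hiv_total, in_both)

# People seen in at least one of the two registries.
seen = aids_total + hiv_total - in_both
print(f"estimated N = {n_hat:.0f}, estimated never captured = {n_hat - seen:.0f}")
```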
However, if the two sources are positively dependent (i.e. if people are captured by one source then they are more likely to be captured by the other source), then the use of Petersen’s estimator results in an underestimate of the true size of the population. Conversely, if the two sources are negatively dependent (i.e. if people are captured by one source they are less likely to be captured by the other source), then Petersen’s estimator produces an overestimate. When only two sources are available, the assumption of independence is clearly the main weak point of the capture-recapture method.

For cases in which more than two sources are available, a variety of models that incorporate dependence have been proposed, of which the most commonly used are the log-linear models and the sample-coverage approach.11 In brief, in the log-linear models, the data are regarded as a form of an incomplete $2^t$ contingency table (with $t$ indicating the number of sources available), and various log-linear models are fitted to the observed cells. The best-fitting model is selected using the deviance statistic and assuming that there is no $t$-list interaction term (e.g. if four sources are available, then it is assumed that there is no four-source interaction term).

In the sample-coverage approach, dependence is modelled by a parameter known as the ‘coefficient of variation’, which measures the degree of dependence between two sources.11 The sample coverage is the average of the overlap fractions among sources, that is, the average of the fraction of cases found more than once. When independence exists, the population size can be estimated as the ratio of overlapped cases to overlap fraction.
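The independence case of the sample-coverage approach can be sketched as below. The sketch assumes that the overlap fraction of a source is the share of its cases that also appear in at least one other source, and that the "overlapped cases" are the distinct people observed minus the average number found in one source only; this is one reading of the approach in ref. 11, not the CARE-1 implementation. It omits the dependence adjustment, so its output is not expected to reproduce the estimates in Table 3. Input counts are taken from Tables 1 and 2.

```python
def sample_coverage_independent(total_per_source, unique_per_source, distinct_people):
    """Return (estimated coverage, population estimate) for the independence case."""
    t = len(total_per_source)
    # Overlap fraction of a source: share of its cases also found elsewhere.
    overlap_fractions = [
        1 - unique / total
        for unique, total in zip(unique_per_source, total_per_source)
    ]
    coverage = sum(overlap_fractions) / t
    # Average, over sources, of the distinct people not unique to that source.
    overlapped = distinct_people - sum(unique_per_source) / t
    return coverage, overlapped / coverage


# Order: AIDS, HIV, Death, Hospital-Discharge registries.
totals = [2801, 6415, 1598, 3330]   # people reported to each registry
uniques = [409, 3341, 47, 1099]     # people found in that registry only
coverage, n_hat0 = sample_coverage_independent(totals, uniques, distinct_people=8723)
print(f"coverage = {coverage:.3f}, independence-case estimate = {n_hat0:.0f}")
```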
When dependence exists, it is taken into account by adjusting the simple estimator with a factor derived from a function of the two-sample coefficients of variation.11

In our study, we used an interactive program developed to apply capture-recapture methods (CARE, available at http://www.stat.nthu.edu.tw/∼chao/) for estimating the cumulative number of people diagnosed with HIV infection using the previously described approaches. Regarding the choice of the log-linear model, we selected the model with the lowest deviance statistic and, as a comparison, the saturated model. No model selection is necessary in the sample coverage approach. The 95% CI were calculated using a logarithmic transformation for the log-linear models. For the sample coverage approach, 95% CI were also calculated using a logarithmic transformation after having calculated the standard error of the estimate using 1000 bootstrap replications.

To determine whether the estimate was particularly sensitive to the inclusion of a specific registry, we initially used only three of the four registries, obtaining four different estimates. We then obtained estimates using all four registries.

Results

Table 1 shows the cumulative number of people with HIV infection reported to each of the four registries and the main characteristics of these registries. As shown in the Table, the registries were created at different times. Specifically, the AIDS Registry was created earlier than the other three registries, and it has covered the entire period of the AIDS epidemic in Italy. Furthermore, the data recorded by the Registry of Deaths for the period 1992–1995 and the data recorded by the Hospital-Discharge Registry for 1997–2000 are only partially complete (see Methods).

Table 2 shows the cumulative number of HIV-infected people found in one, two, three, and four registries. The number of people found in at least one registry was 8723.
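The logarithmic transformation mentioned in the Methods for the 95% CI can be sketched as follows. The Methods do not give the formula, so this sketch assumes the commonly used form in which the number of unobserved people (the estimate minus the number observed) is treated as log-normally distributed. The standard error of 163 is a hypothetical value chosen for illustration, since the article does not report one. The published interval for the four-registry sample-coverage estimate (10 981 to 11 621 around 11 281, with 8723 observed) is symmetric on this scale, which is consistent with the assumption.

```python
import math


def log_transformed_ci(n_hat: float, observed: int, se: float, z: float = 1.96):
    """Interval for N built on the log scale of the unobserved count."""
    unseen = n_hat - observed
    if unseen <= 0:
        raise ValueError("estimate must exceed the observed count")
    # Multiplicative factor applied to the unobserved count.
    factor = math.exp(z * math.sqrt(math.log(1 + (se / unseen) ** 2)))
    return observed + unseen / factor, observed + unseen * factor


# se=163 is hypothetical; 11281 and 8723 are the figures reported in the article.
lower, upper = log_transformed_ci(n_hat=11281, observed=8723, se=163)
print(f"95% CI: {lower:.0f} to {upper:.0f}")
```

A design consequence of this form is that the lower limit can never fall below the number of people actually observed, which a symmetric interval around the estimate would not guarantee.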
The greatest percentage of people was found in only one registry (4896 out of 8723; 56.1%); 2387 (27.4%) were found in only two registries; 1286 (14.7%) in only three registries; and only 154 (1.8%) in all four registries.

Table 3 shows the estimates of the cumulative number of people diagnosed with HIV infection using the log-linear model and the sample-coverage approach, when cross-linking only three registries and when cross-linking all four registries. The estimates obtained with the log-linear model were consistently similar to those obtained with the sample-coverage approach, although in some cases the deviance statistic (used to determine the goodness-of-fit of the model) was large, implying that the selected models were not adequate. The estimates obtained using three registries were around 11 000–12 500 when including the HIV Registry; when this registry was not included, the estimates were lower (ranging from approximately 7000 to 8000 cases). The estimates obtained when using all four registries were similar to those obtained using three registries that included the HIV Registry (i.e. around 11 000–12 500 cases).

Discussion

When cross-linking records from four different registries to which cases of HIV infection are reported, we found 8723 HIV-infected people resident in the Veneto Region of Italy. Using capture-recapture methods and linking all four sources, we estimated, using the sample-coverage approach,11 that there were more than 11 200 diagnoses of HIV infection. Similar estimates were found using the log-linear model, although this model did not always fit the data well. Cross-linkage alone therefore identified about 23% fewer HIV diagnoses than this estimate. The estimate is about four times the cumulative number of AIDS cases reported to the AIDS Registry, and the 7563 cases found when cross-linking data from the HIV Registry and the AIDS Registry fall approximately 32.5% short of it.
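The tallies quoted in the Results and Discussion can be re-derived from the cell frequencies in Table 2. The sketch below encodes each row of Table 2 as a tuple of presence flags (AIDS, HIV, Death, Hospital-Discharge) and recomputes the counts by number of registries, the per-registry totals, the number found in the AIDS or HIV Registry, and the coverage relative to the sample-coverage estimate. Variable names are illustrative.

```python
# Table 2 rows: (AIDS, HIV, Death, Hospital-Discharge) presence flags -> frequency.
TABLE_2 = {
    (1, 0, 0, 0): 409, (0, 1, 0, 0): 3341, (0, 0, 1, 0): 47, (0, 0, 0, 1): 1099,
    (1, 1, 0, 0): 342, (1, 0, 1, 0): 400, (1, 0, 0, 1): 255, (0, 1, 1, 0): 135,
    (0, 1, 0, 1): 1241, (0, 0, 1, 1): 14, (1, 1, 1, 0): 719, (1, 1, 0, 1): 438,
    (1, 0, 1, 1): 84, (0, 1, 1, 1): 45, (1, 1, 1, 1): 154,
}

total = sum(TABLE_2.values())

# People grouped by how many registries they appear in.
by_count = {k: 0 for k in range(1, 5)}
for flags, freq in TABLE_2.items():
    by_count[sum(flags)] += freq

# Marginal total per registry, and the union of the AIDS and HIV registries.
names = ("AIDS", "HIV", "Death", "Hospital-Discharge")
per_registry = {
    name: sum(f for flags, f in TABLE_2.items() if flags[i])
    for i, name in enumerate(names)
}
aids_or_hiv = sum(f for flags, f in TABLE_2.items() if flags[0] or flags[1])

estimate = 11281  # four-registry sample-coverage estimate reported in Table 3
for k, n in by_count.items():
    print(f"in {k} registries: {n} ({100 * n / total:.1f}%)")
print(f"coverage of the four registries: {100 * total / estimate:.1f}%")
print(f"AIDS or HIV Registry: {aids_or_hiv} ({100 * aids_or_hiv / estimate:.1f}% of the estimate)")
```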
Although our estimate of the number of diagnoses is higher than previous estimates of the cumulative number of infections in the Veneto Region (i.e. between 9000 and 10 000 cases, both diagnosed and not), which were obtained with two different back-calculation methods that jointly used data from the HIV and AIDS Registries, this can be expected, in that these latter estimates date back to 1995.16,17 To have an overall idea of the cumulative number of people diagnosed with HIV infection in the entire country of Italy, we considered that the number of AIDS cases reported to the National AIDS Registry for people resident in the Veneto Region accounts for approximately 6% of the cumulative number of AIDS cases reported for the entire country to the same registry (i.e. about 1 out of every 16.7 cases).1 If the same ratio is applied to our estimate of the cumulative number of HIV diagnoses (i.e. multiplying 11 200 by 16.7), then the number of diagnoses in all of Italy would be more than 180 000. This estimate is higher than those previously reported, although all of the previous estimates date back to before 1995.18–21 From a public health point of view, it is important to estimate the number of people currently living with HIV infection. If we subtract the total number of deaths (i.e. International Classification of Diseases, Ninth Revision (ICD-9): 279.1; or ICD-10: 042–044) reported to the Death Registry, including those that could not be linked in the present analysis (i.e. a total of 1966 deaths in the Veneto Region and of around 35 000 deaths in all of Italy), we obtain around 9250 prevalent infections in the Veneto Region and around 145 000 prevalent infections in Italy. However, it bears mentioning that this figure would represent an overestimate if the number of deaths were underestimated because of, for example, HIV-infected people dying from causes not related to HIV infection (e.g. overdose and suicide). 
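The extrapolation to the whole country and the subtraction of deaths amount to simple arithmetic, sketched below with the rounded inputs quoted in the text. The function name and dictionary keys are illustrative. Note that the text rounds the national figure to "more than 180 000" before subtracting deaths, so its "around 145 000" is somewhat lower than the unrounded result computed here.

```python
def extrapolate(veneto_diagnoses, national_ratio, veneto_deaths, italy_deaths):
    """Scale regional diagnoses to the country and subtract reported deaths."""
    italy_diagnoses = veneto_diagnoses * national_ratio
    return {
        "italy_diagnoses": italy_diagnoses,
        "veneto_prevalent": veneto_diagnoses - veneto_deaths,
        "italy_prevalent": italy_diagnoses - italy_deaths,
    }


# Inputs as quoted in the text: 11 200 diagnoses, a ratio of 16.7,
# 1966 deaths in the Veneto Region and around 35 000 deaths in Italy.
result = extrapolate(11_200, 16.7, 1_966, 35_000)
for key, value in result.items():
    print(f"{key}: {value:,.0f}")
```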
We found that the estimate provided by capture-recapture methods using four registries was similar to those obtained using three registries, yet only when the HIV Registry was included. This finding was expected, in that the other three registries usually receive reports of people for whom HIV infection has reached an advanced stage and thus miss those with asymptomatic HIV infection. Thus, when applying this methodology in other settings, it should be considered that more accurate estimates would be obtained if the registries covered people in various phases of the disease (e.g. asymptomatic and advanced disease).

The 6415 diagnoses of HIV infection reported to the HIV Registry only represent 57% of the estimated 11 200 cumulative diagnoses. The most important reason for this low coverage is that the HIV Registry was only created in 1988 and it reported only new HIV diagnoses;13 at that time, the HIV epidemic curve in Italy had probably already peaked and many individuals had probably already been diagnosed. Another reason is that the HIV Registry has a high probability of missing people who discover their seropositivity only upon AIDS diagnosis. This is because medical personnel often take the attitude that, since HIV-infected people who have already developed AIDS are reported to the AIDS Registry, it is not necessary to notify the HIV Registry.9

In interpreting the findings of this analysis, it must be considered that certain limitations could have arisen because of the assumptions made by capture-recapture methods (i.e. unique and correct identification of the individual, ascertainment assumption, closure assumption, definition of the population).11 First, the code we used for cross-linking is not always unique to one individual.
However, a previous evaluation of the entire database of the Italian AIDS Registry, which records the person’s full name, showed that only about 1 out of every 1000 people had the same code as another person (Pezzotti P, personal communication). Furthermore, we found that none of those HIV-infected had the same code when using the full name recorded in the AIDS Registry (Veneto residents only), in the Death Registry, and in the Hospital-Discharge Registry. Although we also attempted to reduce the bias due to errors in the codes, there is no way of guaranteeing that all errors were detected. A second limitation is that the data from the HIV Registry, the Death Registry, and the Hospital-Discharge Registry do not cover the entire period of the epidemic. However, even when the capture-recapture analysis was performed excluding the AIDS Registry, whose data do cover the entire period of the epidemic, the estimates remained very similar to that obtained with four registries. Third, the closure assumption requires that the size of the population be relatively constant during the entire study period, yet the validity of this assumption is difficult to evaluate. Nonetheless, according to previous studies using back-calculation models, only early in the epidemic (i.e. in the mid 1980s) did the number of prevalent infections substantially increase, with no appreciable modifications afterwards.16–21 Thus this issue is probably of limited relevance, in that three of our data sources began collecting data in the late 1980s and the estimates based on only these three sources were similar to those obtained when including data from the AIDS Registry (i.e. the only source that has collected data for the entire epidemic period). In interpreting this latter result, it must be considered that the epidemic commenced later in Italy compared with other industrialized countries. 
In the Veneto Region in particular, until 1988, there were very few AIDS deaths, and thus we should not have lost a great number of people. Finally, it should be stressed that we estimated the number of diagnoses and not the number of infections, and thus our estimates should be considered as a lower bound of the cumulative number of infections because an unknown proportion of HIV-positive people who are asymptomatic and with a low perception of risk may not have undergone HIV testing. KEY MESSAGES The cumulative number of people diagnosed with human immunodeficiency virus (HIV) infection in the Veneto Region of Italy was estimated by cross-linking data from the AIDS Registry, the HIV Registry, the Death Registry, and the Hospital-Discharge Registry, and by applying capture-recapture methods. The cumulative number of reported AIDS cases greatly underestimates the cumulative number of HIV diagnoses, as does the surveillance of new HIV diagnoses, which was begun far into the epidemic. The combined use of cross-linkage and capture-recapture methods could help to evaluate the coverage of HIV and AIDS surveillance systems and to obtain more accurate prevalence estimates, which are especially important for allocating healthcare resources. In conclusion, the combined use of cross-linkage and capture-recapture methods could help to evaluate the coverage of HIV and AIDS surveillance systems and to obtain more accurate prevalence estimates, which are especially important for allocating healthcare resources. Table 1 Cumulative number of people with human immunodeficiency virus (HIV) infection reported to four different registries in the Veneto Region of Italy Registry . Period . Name-based? . Cumulative no. of cases of HIV infection . Notes on the characteristics of the registries . aFirst and third letters of the surname and of the first name, the last two digits of the year of birth, and either ‘M’ for males or ‘F’ for females (total of 11 characters). 
AIDS Registry 1983–2000 Yes 2801 – HIV-Registry 1988–2000 No 6415 Unique record numbera Death Registry 1992–1999 Yes 1598 Data from 1992–1995 are only 90% complete Hospital-Discharge Registry 1997–2000 Yes 3330 Data are only 70% complete Registry . Period . Name-based? . Cumulative no. of cases of HIV infection . Notes on the characteristics of the registries . aFirst and third letters of the surname and of the first name, the last two digits of the year of birth, and either ‘M’ for males or ‘F’ for females (total of 11 characters). AIDS Registry 1983–2000 Yes 2801 – HIV-Registry 1988–2000 No 6415 Unique record numbera Death Registry 1992–1999 Yes 1598 Data from 1992–1995 are only 90% complete Hospital-Discharge Registry 1997–2000 Yes 3330 Data are only 70% complete Open in new tab Table 1 Cumulative number of people with human immunodeficiency virus (HIV) infection reported to four different registries in the Veneto Region of Italy Registry . Period . Name-based? . Cumulative no. of cases of HIV infection . Notes on the characteristics of the registries . aFirst and third letters of the surname and of the first name, the last two digits of the year of birth, and either ‘M’ for males or ‘F’ for females (total of 11 characters). AIDS Registry 1983–2000 Yes 2801 – HIV-Registry 1988–2000 No 6415 Unique record numbera Death Registry 1992–1999 Yes 1598 Data from 1992–1995 are only 90% complete Hospital-Discharge Registry 1997–2000 Yes 3330 Data are only 70% complete Registry . Period . Name-based? . Cumulative no. of cases of HIV infection . Notes on the characteristics of the registries . aFirst and third letters of the surname and of the first name, the last two digits of the year of birth, and either ‘M’ for males or ‘F’ for females (total of 11 characters). 
AIDS Registry 1983–2000 Yes 2801 – HIV-Registry 1988–2000 No 6415 Unique record numbera Death Registry 1992–1999 Yes 1598 Data from 1992–1995 are only 90% complete Hospital-Discharge Registry 1997–2000 Yes 3330 Data are only 70% complete Open in new tab Table 2 Cumulative number of people with human immunodeficiency virus (HIV) infection present in one, two, three, and four registries AIDS Registry . HIV Registry . Death Registry . Hospital-Discharge Registry . Frequency . Yes No No No 409 No Yes No No 3341 No No Yes No 47 No No No Yes 1099 Yes Yes No No 342 Yes No Yes No 400 Yes No No Yes 255 No Yes Yes No 135 No Yes No Yes 1241 No No Yes Yes 14 Yes Yes Yes No 719 Yes Yes No Yes 438 Yes No Yes Yes 84 No Yes Yes Yes 45 Yes Yes Yes Yes 154 Total present in at least one registry 8723 AIDS Registry . HIV Registry . Death Registry . Hospital-Discharge Registry . Frequency . Yes No No No 409 No Yes No No 3341 No No Yes No 47 No No No Yes 1099 Yes Yes No No 342 Yes No Yes No 400 Yes No No Yes 255 No Yes Yes No 135 No Yes No Yes 1241 No No Yes Yes 14 Yes Yes Yes No 719 Yes Yes No Yes 438 Yes No Yes Yes 84 No Yes Yes Yes 45 Yes Yes Yes Yes 154 Total present in at least one registry 8723 Open in new tab Table 2 Cumulative number of people with human immunodeficiency virus (HIV) infection present in one, two, three, and four registries AIDS Registry . HIV Registry . Death Registry . Hospital-Discharge Registry . Frequency . Yes No No No 409 No Yes No No 3341 No No Yes No 47 No No No Yes 1099 Yes Yes No No 342 Yes No Yes No 400 Yes No No Yes 255 No Yes Yes No 135 No Yes No Yes 1241 No No Yes Yes 14 Yes Yes Yes No 719 Yes Yes No Yes 438 Yes No Yes Yes 84 No Yes Yes Yes 45 Yes Yes Yes Yes 154 Total present in at least one registry 8723 AIDS Registry . HIV Registry . Death Registry . Hospital-Discharge Registry . Frequency . 
Yes  No   No   No    409
No   Yes  No   No    3341
No   No   Yes  No    47
No   No   No   Yes   1099
Yes  Yes  No   No    342
Yes  No   Yes  No    400
Yes  No   No   Yes   255
No   Yes  Yes  No    135
No   Yes  No   Yes   1241
No   No   Yes  Yes   14
Yes  Yes  Yes  No    719
Yes  Yes  No   Yes   438
Yes  No   Yes  Yes   84
No   Yes  Yes  Yes   45
Yes  Yes  Yes  Yes   154
Total present in at least one registry  8723

Table 3 Estimates of the population size, with 95% CI in parentheses, based on incomplete sources by capture-recapture methods. The models considered include the log-linear models and the sample coverage approach.

Registries used in the capture-recapture estimate: A = AIDS registry, H = HIV registry, D = Death registry, R = Hospital registry.

             Log-linear model                                                                       Sample coverage approach
Registries   Model              Deviance  d.f.  Best fitting model        Saturated model
                                                Estimate (95% CI)         Estimate (95% CI)         Estimate (95% CI)
A, H, D      HD/AD^a            10.18*    1     11 525 (11 104, 11 996)   10 008 (9338, 10 940)     11 374 (10 826, 12 017)
A, H, R      part-qs1^b         0.0       1     12 677 (12 043, 13 432)   12 682 (11 999, 13 506)   12 512 (11 949, 13 173)
A, D, R      AD/DR^c            6.45*     1     7918 (7630, 8242)         7046 (6560, 7732)         7186 (6837, 7619)
H, D, R      HD/DR^d            0.2       1     11 284 (11 031, 11 560)   11 471 (10 699, 12 493)   11 186 (10 972, 11 416)
A, H, D, R   H1 H2 AD/HR/DR^e   51.41*    5     12 981 (12 418, 13 629)   15 391 (11 867, 22 864)   11 281 (10 981, 11 621)

^a Selected best fitting model included two different interaction terms [HIV and Death registry (HD); AIDS and Death registry (AD)].
^b Selected best fitting model included one common interaction term [reported as part-qs1 in ref. 11 (AIDS and HIV registry; HIV and Death registry)].
^c Selected best fitting model included two different interaction terms [AIDS and Death registry (AD); Death and Hospital registry (DR)].
^d Selected best fitting model included two different interaction terms [HIV and Death registry (HD); Death and Hospital registry (DR)].
^e Selected best fitting model included one common 2-factor interaction term (reported as H1 in ref. 11); one common 3-factor interaction term (H2); and three different interaction terms [AIDS and Death registry (AD); HIV and Hospital registry (HR); Death and Hospital registry (DR)].
* Deviance shows that the selected best fitting model does not fit the data well.
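As a rough cross-check of the three-registry rows of Table 3, the sketch below collapses the Table 2 capture histories onto each set of three registries and applies the standard closed-form estimator for the saturated three-source log-linear model (all two-way interactions, no three-way interaction). It is not the CARE-1 software used in the study. The names COUNTS, collapse, and saturated_estimate are illustrative assumptions, and the column order (AIDS, HIV, Death, Hospital) is inferred from the registry totals given in the abstract. The value 11 281 used for the coverage is the sample coverage estimate reported in Table 3.

```python
from itertools import combinations

# Capture histories from Table 2, keyed by presence in the
# (AIDS, HIV, Death, Hospital) registries; 1 = present, 0 = absent.
# The column order is an assumption inferred from the registry totals.
REGISTRIES = ("A", "H", "D", "R")
COUNTS = {
    (1, 0, 0, 0): 409,
    (0, 1, 0, 0): 3341,
    (0, 0, 1, 0): 47,
    (0, 0, 0, 1): 1099,
    (1, 1, 0, 0): 342,
    (1, 0, 1, 0): 400,
    (1, 0, 0, 1): 255,
    (0, 1, 1, 0): 135,
    (0, 1, 0, 1): 1241,
    (0, 0, 1, 1): 14,
    (1, 1, 1, 0): 719,
    (1, 1, 0, 1): 438,
    (1, 0, 1, 1): 84,
    (0, 1, 1, 1): 45,
    (1, 1, 1, 1): 154,
}


def collapse(counts, keep):
    """Sum the four-registry counts over the registries not in `keep`."""
    idx = [REGISTRIES.index(r) for r in keep]
    cells = {}
    for history, n in counts.items():
        key = tuple(history[i] for i in idx)
        if any(key):  # people in none of the kept registries count as unobserved
            cells[key] = cells.get(key, 0) + n
    return cells


def saturated_estimate(cells):
    """Three-source saturated log-linear estimate of the population size.

    With no three-way interaction, the unobserved cell is estimated as
    n000 = n111 * n100 * n010 * n001 / (n110 * n101 * n011).
    """
    missing = (
        cells[1, 1, 1] * cells[1, 0, 0] * cells[0, 1, 0] * cells[0, 0, 1]
    ) / (cells[1, 1, 0] * cells[1, 0, 1] * cells[0, 1, 1])
    return sum(cells.values()) + missing


# One estimate per set of three registries: AHD, AHR, ADR, HDR.
estimates = {
    "".join(trio): saturated_estimate(collapse(COUNTS, trio))
    for trio in combinations(REGISTRIES, 3)
}
for trio, n_hat in estimates.items():
    print(trio, round(n_hat))

# Coverage of the four registries against the sample coverage estimate.
observed = sum(COUNTS.values())
coverage = observed / 11281
print(f"coverage: {coverage:.1%}")
```

Under these assumptions the rounded estimates come out as 10 008, 12 682, 7046, and 11 471, which agree with the saturated model column of Table 3, and the coverage is 77.3%. The confidence intervals and the best fitting and sample coverage estimates are not reproduced here.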
The study was supported by a grant from ‘Regione del Veneto’ provided to the Istituto Superiore di Sanità. The authors wish to thank Mark Kanieff for his help in revising the manuscript.

References
1 Centro Operativo AIDS. Aggiornamento dei casi di AIDS al 30 Giugno 2001. Notiziario ISS 2001; 14 (Suppl.).
2 European Centre for the Epidemiological Monitoring of AIDS. HIV/AIDS Surveillance in Europe. End Year Report 2001. Saint Maurice: Institut de Veille Sanitaire, 2002. No. 66.
3 World Health Organization. Global AIDS surveillance. Weekly Epidemiological Record 2001; 76: 390–96.
4 Centers for Disease Control. US HIV and AIDS cases reported through June 2001. HIV/AIDS Surveillance Report, Midyear edition, Vol. 13, No. 1.
5 The Cascade Collaboration. Survival after introduction of HAART in people with known duration of HIV-1 infection. Lancet 2000; 355: 1158–59.
6 Gostin LO, Ward JW, Baker AC. National HIV case reporting for the United States.
A defining moment in the history of the epidemic. N Engl J Med 1997; 337: 1162–67.
7 Community Disease Surveillance Centre. HIV and AIDS in UK. An Epidemiological Review: 2000. (http://www.phls.co.uk/topics_az/hiv_and_sti/publications/hiv_annual_2000.pdf)
8 Community Disease Surveillance Centre. Estimating the Proportion of Identified HIV Infection which are Reported to the PHLS Aids Unit at CDSC. Unpublished internal report, 1994.
9 Michieletto F, Piovesan C, Gallo G. Sensitivity and representativeness of an HIV surveillance system in the Veneto region. Bollettino Epidemiologico Nazionale - Notiziario ISS 2001; 14. (http://www.ben.iss.it/precedenti/dicembre/2eng.htm)
10 Aavitsland P, Nilsen Ø, Lystad A. Anonymous reporting of HIV infection: an evaluation of the HIV/AIDS surveillance system in Norway 1983–2000. Eur J Epidemiol 2001; 17: 479–89.
11 Chao A, Tsay PK, Lin SH, Shau WY, Chao DY. The applications of capture-recapture models to epidemiological data. Stat Med 2001; 20: 3123–57.
12 Ancelle-Park RA, Alix J, Downs AM, Brunet JB. Impact of 1993 revision of adult/adolescent AIDS surveillance case-definition for Europe. National Coordinators for AIDS Surveillance in 38 European countries. Lancet 1995; 345: 789–90.
13 Pezzotti P, Piovesan C, Rezza G, Moro G, Serpelloni G, Ferraro P. Combining HIV and AIDS surveillance: an experience from an Italian region. Int J Epidemiol 1997; 26: 1352–58.
14 Istituto Nazionale di Statistica (ISTAT). Annuari n.14 (2001), Cause di Morte Anno 1998. Roma: ISTAT.
15 Ministero della Sanità. Istituzione della Scheda di Dimissione Ospedaliera. Decreto Ministeriale 28 Dicembre 1991.
16 Glad I, Frigessi A, Scalia Tomba G, Balducci M, Pezzotti P. Bayesian Back-calculation with HIV Seropositivity Notifications. Statistical research report, Department of Mathematics, University of Oslo, Oslo, 1998.
17 Bellocco R, Marschner IC. Joint analysis of HIV and AIDS surveillance data in back-calculation. Stat Med 2000; 19: 297–311.
18 Verdecchia A, Mariotto A, Capocaccia R, Mariotti S. An age and period reconstruction of the HIV epidemic in Italy. Int J Epidemiol 1994; 23: 1027–39.
19 Mariotti S, Cascioli R. Sources of uncertainty in estimating HIV infection rates by back-calculation: an application to Italian data. Stat Med 1996; 15: 2669–87.
20 Downs AM, Heisterkamp SH, Rava L, Houweling H, Jager JC, Hamers FF. Back-calculation by birth cohort, incorporating age-specific disease progression, pre-AIDS mortality and change in European AIDS case definition. European Union Concerted Action on Multinational AIDS Scenarios. AIDS 2000; 14: 2179–89.
21 Bellocco R, Pagano M. Multinomial analysis of smoothed HIV back-calculation models incorporating uncertainty in the AIDS incidence. Stat Med 2001; 20: 2017–33.

© International Epidemiological Association 2003
TI - Estimating the cumulative number of human immunodeficiency virus diagnoses by cross-linking from four different sources JF - International Journal of Epidemiology DO - 10.1093/ije/dyg202 DA - 2003-10-01 UR - https://www.deepdyve.com/lp/oxford-university-press/estimating-the-cumulative-number-of-human-immunodeficiency-virus-AsbsIPlLA3 SP - 778 EP - 783 VL - 32 IS - 5 DP - DeepDyve ER -