Automated Cervical Screening and Triage, Based on HPV Testing and Computer-Interpreted Cytology

Abstract Background State-of-the-art cervical cancer prevention includes human papillomavirus (HPV) vaccination among adolescents and screening/treatment of cervical precancer (CIN3/AIS and, less strictly, CIN2) among adults. HPV testing provides sensitive detection of precancer but, to reduce overtreatment, secondary “triage” is needed to predict women at highest risk. Those with the highest-risk HPV types or abnormal cytology are commonly referred to colposcopy; however, expert cytology services are critically lacking in many regions. Methods To permit completely automatable cervical screening/triage, we designed and validated a novel triage method, a cytologic risk score algorithm based on computer-scanned liquid-based slide features (FocalPoint, BD, Burlington, NC). We compared it with abnormal cytology in predicting precancer among 1839 women testing HPV positive (HC2, Qiagen, Germantown, MD) in 2010 at Kaiser Permanente Northern California (KPNC). Precancer outcomes were ascertained by record linkage. As additional validation, we compared the algorithm prospectively with cytology results among 243 807 women screened at KPNC (2016–2017). All statistical tests were two-sided. Results Among HPV-positive women, the algorithm matched the triage performance of abnormal cytology. Combined with HPV16/18/45 typing (Onclarity, BD, Sparks, MD), the automatable strategy referred 91.7% of HPV-positive CIN3/AIS cases to immediate colposcopy while deferring 38.4% of all HPV-positive women to one-year retesting (compared with 89.1% and 37.4%, respectively, for typing and cytology triage). In the 2016–2017 validation, the predicted risk scores strongly correlated with cytology (P < .001). Conclusions High-quality cervical screening and triage performance is achievable using this completely automated approach.
Automated technology could permit extension of high-quality cervical screening/triage coverage to currently underserved regions. Cervical cancer is caused by persistent infection with carcinogenic human papillomaviruses (HPV) (1). Prophylactic HPV vaccination to control transmission is the ultimate prevention strategy; however, vaccination remains a long-term solution due to still-limited coverage and the long latency between infection and cancer development (2). Improving and expanding cervical screening with a focus on increasingly vaccinated cohorts and lower-resource settings will remain a global prevention priority for decades; otherwise, the global burden of cervical cancer will continue to rise (3). It might be possible to expand the reach of screening programs using automation to overcome the chronic shortage of skilled laboratory professionals (4). The main goal of cervical screening is detection of high-grade “precancers” (defined by histopathology of CIN3 and AIS, or less stringently for maximal protection as CIN2/CIN3/AIS), which can be treated to prevent cervical cancer mortality and morbidity. Cervical screening programs include two distinct procedures: screening of the general population and triage of screen-positive women to focus treatment on precancers. For the general screening phase, HPV testing for the carcinogenic types of HPV is gradually replacing cytology (Pap testing), because HPV testing is more sensitive for detection of precancer with more sustained negative predictive value (5). It is also more reproducible and adaptable to increasingly HPV-vaccinated populations. However, a pressing and unsolved problem is how best to identify, among the 5% to 10% of women (or more) who test positive with HPV screening, the small minority who are most likely to have precancer requiring treatment.
Most women control/clear their infections and do not need treatment, which involves ablation or excision of the ring of cervical tissue especially susceptible to HPV-induced carcinogenesis (1). It is partly the cost and complexity of triage testing needed to avoid overtreatment that is impeding widespread adoption of HPV screening programs (6). The leading triage strategy in higher-resource regions combines partial HPV typing (to identify the highest-risk types) and cytology (used in this case as a second test among HPV-positive women) (7–10). Conventional cytology, which is currently combined with HPV testing in most US screening programs to determine which women should go to colposcopy, is labor-intensive. Use of computer-interpreted cytology for triage would permit the automation of the whole screening process. Here we report the design and evaluation of such a fully automatable cervical screening strategy to determine whether an automated triage algorithm could triage HPV-positive women as accurately as conventionally interpreted liquid-based SurePath cytology (BD, Burlington, NC). Methods Study Design and Participants The study was conducted in two parts, both using data from the Kaiser Permanente Northern California (KPNC) cervical screening program: 1) development and validation of the risk score algorithm predicting precancer in a cohort analysis of women cotested with HPV and computer-scanned liquid-based cytology in 2010 and 2) validation of the association between higher risk scores and HPV-positive, abnormal cytology cotest results, in a large cohort of women screened in 2016–2017. We developed and initially validated the computer algorithm based on stored, residual cervical specimens and liquid-based cytology slides from the Persistence and Progression (PaP) study conducted within the KPNC cervical screening program, which has used cytology-HPV cotesting since 2003 (11,12). 
The NCI-KPNC PaP study was formed by collection, neutralization, and freezing of discarded aliquots of routinely collected cervical specimens in STM buffer tested for HPV by HC2 (Qiagen, Germantown, MD). The baseline collection included 54 723 women with cotesting in 2007–2011. When women in the PaP cohort returned for later cotest visits, additional discard specimens were collected and stored. Clinical outcomes of infections were ascertained by merging data from the KPNC cytopathology, histopathology, and HPV testing files, complete at the time of analysis through 2015. To develop the risk score algorithm, we used HPV test specimens and the corresponding stored cotest liquid-based cytology slides (SurePath, BD Diagnostics, Sparks, MD) from the subgroup of women in the PaP cohort with an HC2-positive baseline or follow-up visit in 2010. The final analytic set (called the 2010 cohort) consisted of 1839 subjects. We considered separately the subset of 20 cases with histopathologic outcomes of invasive cancer, because we were mainly studying screening to find treatable precancer. The flowchart for obtaining the final set is given in Supplementary Figure 1 (available online). To further validate the risk score algorithm, we used a set of 243 807 scanned SurePath slides from routine cervical screening at KPNC in late 2016 to mid-2017 (called the 2016–2017 cohort). All specimen use in this study has been approved by both the KPNC and National Cancer Institute Institutional Review Boards. Use of masked discard specimens was judged not to require written informed consent; women could opt out of use of their specimens (∼8% did). Partial HPV Typing We conducted partial HPV typing of the HC2-positive women from 2010 stored aliquots using Onclarity (Becton Dickinson Diagnostics, Sparks, MD), a nine-channel DNA amplification and detection assay recently approved by the US Food and Drug Administration for HPV testing and separate identification of HPV16, HPV18, and HPV45 (13). 
Here we combined the nine HPV type channel results into four HPV groups according to the earlier work (11). The pre-established, hierarchical order from highest to lowest risk of precancer/cancer was HPV16, else HPV31/33/52/58, else HPV18/45, else HPV35/39/51/56/59/66/68. Histopathology Outcomes In the analysis of the 2010 cohort, we first ascertained the worst histologic diagnosis made by KPNC pathologists during follow-up from the time of the screening cotest to the end of 2015 (median = 3.7 years, interquartile range = 2.3–4.6 years). We chose CIN2/CIN3/AIS as the measurement of precancer for algorithm training, because it represents the treatment threshold in this age group at KPNC. When evaluating the performance of the established algorithm, we also considered CIN3/AIS as the more stringent histopathologic diagnoses better representing true precancer (14), although the numbers of cases were more limited. For the absolute risk estimate, we focused on three-year risk of precancer to include an entire screening cycle. Statistical Analysis We used the likelihood ratio test based on the linear regression model to compare risk score distributions among different groups, the likelihood ratio test based on the logistic regression model to evaluate the association between a risk factor and the case–control status, and McNemar’s test to compare sensitivity and specificity between two given diagnostic rules. All statistical tests were two-sided, and a P value of less than .05 was considered statistically significant. Development of the Risk Score Algorithm We developed a novel risk score algorithm (called “the algorithm”) to triage HPV-positive women using a redesign of BD FocalPoint cytology to target precancer. FocalPoint is a slide scanner that performs a high-speed image capture and outputs 160 binary or quantitative “features” of the cytology slide such as presence of different cell types, nuclear size, and nuclear contour (15).
The severity rank generated by the commercial algorithm is currently designed to identify the most innocuous slides in a batch, to reduce and/or guide cytotechnologist workload. Details of the new algorithm that repurposed FocalPoint optical scanning features to target precancer instead are outlined in the Supplementary Methods (available online). Briefly, based on the 2010 cohort of 1839 subjects, we adopted the least absolute shrinkage and selection operator (LASSO) model (16) implemented in the R package glmnet (17) to create the risk score predicting the likelihood of the presence or imminent development of precancer (CIN2/CIN3/AIS) among HPV-positive women. Because the risk score was evaluated on the same set used for training, we employed the leave-one-out cross-validation (LOOCV) procedure (18) to produce an unbiased estimated risk score for each subject for subsequent analyses of the 2010 cohort. Comparing the Algorithm With Cytology Cytology can divide HPV-positive women into one of three categories to reflect the underlying risk for precancer: using Bethesda System terminology, they are normal cytology (NILM), minor HPV-related abnormalities (ASC-US/LSIL), or higher-risk results (>LSIL, ie, AGC/ASC-H/HSIL/AIS/cancer). Most HPV-based screening programs refer to colposcopy all HPV-positive women whose reflex cytology is not normal (ie, ASC-US or worse). To make a direct comparison between the performances of the algorithm and cytology in stratifying risk of precancer among HPV-positive women, we chose two cut-points in the automated risk score to produce three risk groups (called high, intermediate, and low risk) of exactly the same sizes as the ordinary cytology groups of >LSIL, ASC-US/LSIL, and NILM. We used the R package pROC (19) to calculate receiver operating characteristic (ROC) curves to show the sensitivity/specificity trade-offs for the risk score.
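The size-matched grouping described above can be illustrated with a short sketch. It is written in Python purely for illustration (the study's analyses used R); the function name `assign_risk_groups`, the tie-breaking rule, and the toy scores are assumptions rather than the authors' implementation. The idea is to rank women by risk score, label the highest-scoring women "high" until that group is as large as the >LSIL cytology group, label the next block "intermediate" until it matches the ASC-US/LSIL group, and label the remainder "low".

```python
from typing import List, Sequence


def assign_risk_groups(scores: Sequence[float],
                       n_high: int,
                       n_intermediate: int) -> List[str]:
    """Label each score "high", "intermediate", or "low" by rank.

    The n_high largest scores become "high", the next n_intermediate
    become "intermediate", and all remaining scores become "low".
    Ties are broken by input order (an arbitrary, illustrative choice).
    """
    if n_high < 0 or n_intermediate < 0 or n_high + n_intermediate > len(scores):
        raise ValueError("group sizes must be non-negative and fit within the sample")
    # Indices ordered from highest to lowest score; sorted() is stable.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    groups = ["low"] * len(scores)
    for rank, idx in enumerate(order):
        if rank < n_high:
            groups[idx] = "high"
        elif rank < n_high + n_intermediate:
            groups[idx] = "intermediate"
    return groups


# Toy example with six hypothetical scores: 1 high, 2 intermediate, 3 low.
toy_scores = [0.10, 0.80, 0.35, 0.05, 0.60, 0.20]
toy_groups = assign_risk_groups(toy_scores, n_high=1, n_intermediate=2)
print(toy_groups)
```

Under this sketch, the 2010 cohort would correspond to a call with `n_high=109` and `n_intermediate=826`, leaving 904 women in the low-risk group.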
To compare the algorithm with cytology, we conducted the McNemar’s test to compare sensitivities for the detection of precancer achieved by the two strategies at the ASC-US or worse threshold (intermediate- or high-risk score), or >LSIL threshold (high-risk score). Assessing the Algorithm and Partial HPV Typing as a Triage Strategy We compared absolute risks for precancer in the 12 groups of HPV-positive women formed by the combination of four HPV type groups and three cytology or corresponding risk score groups. Using a novel logistic Cox model (20), we estimated three-year cumulative risks (including the risk for precancers present at the time of screening) of CIN3/AIS or CIN2/CIN3/AIS for the 12 strata (Supplementary Methods, available online). We chose this model because of interval censoring and existence of prevalent events. To extend the analysis to the whole KPNC screening population, we also estimated precancer risk for HPV-negative women based on women age 30 years and older from KPNC who tested negative by HC2 in 2012 (n = 239 948). In current practice in several countries, HPV-positive women are managed according to a combination of partial HPV typing, when available, and cytology (7–10). Women with the highest-risk HPV types (HPV16, HPV18, and sometimes HPV45) and/or abnormal cytology are referred to colposcopy; the remaining HPV-positive women are retested (at one year in the United States). For such a strategy, we compared the triage procedures (ASC-US or worse cytology vs intermediate-risk score threshold), estimating the percentage of precancers occurring within the subsequent three years that they each would have referred. The P value for this comparison was evaluated through a standard bootstrap procedure. 
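The management rule just described can be written as a small decision function. This is a minimal sketch in Python, assuming the four typing groups and three score groups defined earlier; the function name `triage_action` and the string labels are hypothetical and chosen only for illustration. For the conventional strategy, the score group would be replaced by the size-matched cytology group (>LSIL, ASC-US/LSIL, NILM).

```python
from typing import Optional

# Typing groups referred to colposcopy regardless of cytology or score result.
REFER_BY_TYPE = {"HPV16", "HPV18/45"}
# Score groups size-matched to ASC-US or worse cytology.
REFER_BY_SCORE = {"intermediate", "high"}


def triage_action(hpv_positive: bool,
                  type_group: Optional[str] = None,
                  score_group: Optional[str] = None) -> str:
    """Return "screening", "referral", or "retest" for one screened woman.

    HPV-negative women return to routine screening. HPV-positive women are
    referred if they carry a highest-risk type or have an intermediate/high
    score; all other HPV-positive women are retested (at one year in the
    United States, per the text).
    """
    if not hpv_positive:
        return "screening"
    if type_group in REFER_BY_TYPE or score_group in REFER_BY_SCORE:
        return "referral"
    return "retest"


print(triage_action(True, "HPV16", "low"))            # highest-risk type
print(triage_action(True, "HPV31/33/52/58", "low"))   # other type, low score
print(triage_action(False))                           # HPV-negative
```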
Further Validation in the 2016–2017 Cohort We applied the established algorithm, trained with the 2010 cohort of 1839 subjects, to FocalPoint output obtained from 243 807 slides performed during routine cervical screening at KPNC in late 2016 to mid-2017. We validated the algorithm results against HPV/cytology cotest results obtained from those same screening visits. We used the Spearman’s rank correlation coefficient test for this validation. Results Analysis of 2010 Cohort The 2010 cohort consisted of 1839 HPV-positive women, including the 1529 <CIN2 controls (including women with no referral to colposcopy, or colposcopy/biopsy diagnosis of <CIN2) and 310 cases (181 CIN2 and 129 CIN3/AIS). More detailed baseline data are given in Supplementary Table 1 (available online). Based on the entire 2010 cohort, the final LASSO model for the risk score selected 55 features (Supplementary Table 2, available online). The distributions of the risk score (unbiased estimate by the LOOCV procedure), stratified by cytology and histopathology, are summarized in Supplementary Figure 2 (available online). The cytology results of NILM, ASC-US/LSIL, and >LSIL represented 49%, 45%, and 5.9% of samples in the 2010 cohort. By design, the same proportions of subjects were divided into three risk groups (low, intermediate, or high) according to their risk scores. Table 1 shows the number of control and case subjects falling into each group for each strategy, individually and jointly. The three risk groups defined by each strategy were strongly associated with case–control status (P < .001). Although the algorithm and cytology showed good agreement, each generated further risk stratification. For example, after adjusting for the standard cytology strategy, the risk score–based risk groups were still strongly associated with case–control status (P < .001). Table 1. 
Concordance between the risk groups based on the cytology result and the risk score, in the 2010 cohort of 1839 HPV-positive women, by case–control status*

Cytology-based risk group | Risk score group: High, 53 cases (56 controls) | Risk score group: Intermediate, 181 cases (645 controls) | Risk score group: Low, 76 cases (828 controls)
>LSIL, 69 cases (40 controls) | 25 (5)† | 39 (27) | 5 (8)
ASC-US/LSIL, 151 cases (675 controls) | 22 (36) | 98 (352) | 31 (287)
NILM, 90 cases (814 controls) | 6 (15) | 44 (266) | 40 (533)

* The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the 2010 cohort (904, 826, 109). ASC-US = atypical squamous cells of undetermined significance; CIN2 = cervical intraepithelial neoplasia grade II; HPV = human papillomavirus; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.
† Number of ≥CIN2 cases (number of <CIN2 controls) occurring within three years of screening tests.

Figure 1A shows the ROC curve based on the risk score for diagnosis of CIN2/CIN3/AIS, with an area under the curve (AUC) of 0.71 (95% confidence interval [CI] = 0.68 to 0.74). Also shown on the figure are the comparisons of sensitivity/specificity at thresholds corresponding to ≥ASC-US and >LSIL.
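As a worked check, the sensitivity and specificity at the ≥ASC-US-equivalent threshold, and a paired McNemar comparison of the two strategies, can be recomputed directly from the Table 1 cell counts. The Python sketch below is illustrative only: the variable and function names are hypothetical, and it assumes the continuity-corrected chi-square form of McNemar's test, which the text does not specify.

```python
import math

# Table 1 cells as (cases, controls), keyed by cytology group, then score group.
TABLE1 = {
    ">LSIL":       {"high": (25, 5),  "intermediate": (39, 27),  "low": (5, 8)},
    "ASC-US/LSIL": {"high": (22, 36), "intermediate": (98, 352), "low": (31, 287)},
    "NILM":        {"high": (6, 15),  "intermediate": (44, 266), "low": (40, 533)},
}
CYTOLOGY_POSITIVE = {">LSIL", "ASC-US/LSIL"}  # ASC-US or worse
SCORE_POSITIVE = {"high", "intermediate"}     # size-matched equivalent


def paired_counts(which: int):
    """Cross-classify women by the two triage rules.

    which=0 selects cases (CIN2/CIN3/AIS); which=1 selects controls (<CIN2).
    Returns (both_positive, cytology_only, score_only, both_negative).
    """
    both_pos = cyt_only = score_only = both_neg = 0
    for cyt, row in TABLE1.items():
        for score, cell in row.items():
            n = cell[which]
            cyt_pos = cyt in CYTOLOGY_POSITIVE
            score_pos = score in SCORE_POSITIVE
            if cyt_pos and score_pos:
                both_pos += n
            elif cyt_pos:
                cyt_only += n
            elif score_pos:
                score_only += n
            else:
                both_neg += n
    return both_pos, cyt_only, score_only, both_neg


def mcnemar_p(b: int, c: int) -> float:
    """Two-sided McNemar P value, continuity-corrected (chi-square, 1 df)."""
    if b + c == 0:
        return 1.0
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


case_both, case_cyt, case_score, case_neither = paired_counts(0)
n_cases = case_both + case_cyt + case_score + case_neither
sens_score = (case_both + case_score) / n_cases
sens_cytology = (case_both + case_cyt) / n_cases
p_sens = mcnemar_p(case_cyt, case_score)

ctrl_both, ctrl_cyt, ctrl_score, ctrl_neither = paired_counts(1)
n_controls = ctrl_both + ctrl_cyt + ctrl_score + ctrl_neither
spec_score = (ctrl_neither + ctrl_cyt) / n_controls
spec_cytology = (ctrl_neither + ctrl_score) / n_controls
p_spec = mcnemar_p(ctrl_cyt, ctrl_score)

print(f"sensitivity: score {sens_score:.2f}, cytology {sens_cytology:.2f}, P = {p_sens:.2f}")
print(f"specificity: score {spec_score:.2f}, cytology {spec_cytology:.2f}, P = {p_spec:.2f}")
```

Only the discordant cells enter the test: among cases, 36 were positive by cytology alone and 50 by risk score alone; among controls, 295 were positive by cytology alone and 281 by risk score alone.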
The risk score threshold corresponding to ≥ASC-US (ie, low- vs intermediate/high-risk groups) had a sensitivity of 0.75 (95% CI = 0.70 to 0.80) and a specificity of 0.54 (95% CI = 0.51 to 0.57), neither of which was statistically significantly higher than its counterpart for the standard cytology strategy, which had a sensitivity of 0.71 (95% CI = 0.66 to 0.76) and a specificity of 0.53 (95% CI = 0.50 to 0.56; P = .16 and .59 for sensitivity and specificity comparisons, respectively). When using the threshold corresponding to >LSIL (meant to define a small highest-risk group), the algorithm had statistically nonsignificantly lower sensitivity/specificity than cytology (P = .08 and .11 for sensitivity and specificity comparisons, respectively).

Figure 1. The automated risk score receiver operating characteristic (ROC) curve for the detection of cases among human papillomavirus (HPV)-positive women and sensitivity/specificity comparisons between risk groups derived from the automated risk score and conventional cytology result. A) For the detection of cervical intraepithelial neoplasia grade II (CIN2)/cervical intraepithelial neoplasia grade III (CIN3)/adenocarcinoma in situ (AIS). B) For the detection of CIN3/AIS. Results were based on the 2010 cohort, with risk score unbiasedly estimated by the leave-one-out cross-validation (LOOCV) procedure. The ROC curve plots the trade-off of sensitivity and specificity for increasingly elevated risk scores. Both areas under the curve reflected good overall discrimination between 310 CIN2/CIN3/AIS cases (or 129 CIN3/AIS cases) and 1529 controls (<CIN2), with better area under the curve (AUC) when cases were defined more stringently as CIN3/AIS (excluding CIN2). Direct comparison of the conventional cytology strategy with the algorithm strategy was achieved by first dividing risk according to cytology into three groups (negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance/low-grade squamous intraepithelial lesion [LSIL], and >LSIL, see the Methods for details). The risk scores were divided at two cut-points to create three risk groups of the same sizes as the cytology groups. The sensitivity and specificity of the two strategies at those two cut-points were compared. AUC = area under the curve; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

When cases were restricted to CIN3/AIS (n = 129), with controls still being less than CIN2, the AUC of the risk score was basically unchanged, increasing slightly from 0.71 to 0.74 (95% CI = 0.69 to 0.78) (Figure 1B). The algorithm showed a marginal sensitivity advantage over the corresponding cytology threshold ≥ASC-US (P = .09). We illustrated how the automated screening and triage strategy, which combined HPV testing, partial HPV typing, and the algorithm, might work in real practice within the total screened population including HPV-negative as well as HPV-positive women, for the more important diagnosis of CIN3/AIS (Table 2) and, secondarily, for CIN2/CIN3/AIS (Supplementary Table 3, available online). Details on how those calculations were done are given in Supplementary Methods (available online). For general screening, approximately 93% of women would fall into the HPV-negative group (as estimated by HC2 negativity in the full KPNC cohort tested in 2010). Although HPV-positive women are the focus of this analysis, it is noteworthy that a single negative HPV test would predict a very low absolute risk of precancer in the subsequent years (21), but within this large group, 9.9% of CIN3/AIS would be missed. Table 2.
Estimated three-year cumulative risk of CIN3/AIS given the HPV typing group and the risk score group/cytology group*

HPV status and risk group | Automated: score level | Automated: % of population | Automated: 3-y risk | Automated: % cases diagnosed | Conventional: cytology group | Conventional: % of population | Conventional: 3-y risk | Conventional: % cases diagnosed | Action required
HPV-positive (all) | | 7 | 0.0724 | 90.1 | | 7 | 0.0724 | 90.1 |
HPV16 | High | 0.1 | 0.4117 | 7.5 | >LSIL | 0.1 | 0.5562 | 12.1 | Referral
HPV16 | Intermediate | 0.5 | 0.3144 | 28.4 | ASC-US/LSIL | 0.5 | 0.2659 | 21.0 | Referral
HPV16 | Low | 0.4 | 0.0967 | 7.0 | NILM | 0.5 | 0.1320 | 10.3 | Referral
HPV31/33/52/58 | High | 0.1 | 0.4256 | 7.5 | >LSIL | 0.1 | 0.3700 | 8.1 | Referral
HPV31/33/52/58 | Intermediate | 1.0 | 0.1081 | 18.2 | ASC-US/LSIL | 1.0 | 0.1031 | 16.8 | Referral
HPV31/33/52/58 | Low | 1.1 | 0.0264 | 4.7 | NILM | 1.0 | 0.0298 | 5.2 | Retest
HPV18/45 | High | <0.1 | 0.2002 | 1.3 | >LSIL | <0.1 | 0.1539 | 1.3 | Referral
HPV18/45 | Intermediate | 0.3 | 0.0622 | 3.4 | ASC-US/LSIL | 0.3 | 0.0456 | 2.1 | Referral
HPV18/45 | Low | 0.3 | <0.0001 | <0.1 | NILM | 0.4 | 0.0215 | 1.3 | Referral
HPV35/39/51/56/59/66/68 | High | 0.2 | 0.0227 | 0.6 | >LSIL | 0.1 | 0.1429 | 2.6 | Referral
HPV35/39/51/56/59/66/68 | Intermediate | 1.3 | 0.0389 | 8.7 | ASC-US/LSIL | 1.4 | 0.0197 | 4.8 | Referral
HPV35/39/51/56/59/66/68 | Low | 1.6 | 0.010 | 2.7 | NILM | 1.6 | 0.0173 | 4.6 | Retest
HPV-negative | NA | 93 | 0.0006 | 9.9 | NA | 93 | 0.0006 | 9.9 | Screening

* The risk score was estimated by a leave-one-out cross-validated LASSO model that integrated FocalPoint scanned features. The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the analytic set (49%, 45%, and 6%). The risk was estimated by the logistic Cox model. See the Supplementary Methods (available online) for more details on how estimates in the table were obtained. AIS = adenocarcinoma in situ; ASC-US = atypical squamous cells of undetermined significance; CIN3 = cervical intraepithelial neoplasia grade III; HPV = human papillomavirus; LASSO = least absolute shrinkage and selection operator; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

In triage of the 7% of women testing HPV-positive, the automated and conventional strategies would both refer to colposcopy those women with HPV16, HPV18, or HPV45. As the major point of comparison, those with abnormal cytology or intermediate/high-risk score also would be referred to colposcopy, while women with cytologic result of NILM (or low-risk score group) and without HPV16, HPV18, or HPV45 would be retested at one year. In the whole screened population, the conventional strategy would refer 4.4% of screened women for immediate colposcopy; included in this referred group would be 80.2% of women with prevalent or incident (within three years) CIN3/AIS. The automated strategy would refer a similar proportion of the total population (4.3%) and catch a slightly higher percentage of CIN3/AIS (82.6%, bootstrap estimated P = .54).
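The referral percentages quoted above follow from the "Retest" rows of Table 2. The Python sketch below redoes that arithmetic from the rounded table entries; the constant and function names are hypothetical. Because it starts from rounded inputs, it reproduces the reported figures only approximately (differences of a few tenths of a percentage point are expected), and it is not the authors' calculation, which is described in their Supplementary Methods.

```python
# "Retest" strata from Table 2 as (% of population, % of all CIN3/AIS cases).
AUTOMATED_RETEST = [(1.1, 4.7), (1.6, 2.7)]     # low score, non-16/18/45 types
CONVENTIONAL_RETEST = [(1.0, 5.2), (1.6, 4.6)]  # NILM, non-16/18/45 types

HPV_POSITIVE_POPULATION = 7.0   # % of screened women who are HPV-positive
HPV_POSITIVE_CASES = 90.1       # % of CIN3/AIS cases found in HPV-positive women


def summarize(retest_strata):
    """Derive referral and deferral percentages from the retest strata."""
    deferred_pop = sum(pop for pop, _ in retest_strata)
    deferred_cases = sum(cases for _, cases in retest_strata)
    referred_pop = HPV_POSITIVE_POPULATION - deferred_pop
    referred_cases = HPV_POSITIVE_CASES - deferred_cases
    return {
        "referred_pct_of_population": round(referred_pop, 1),
        "cases_referred_pct_of_all_cases": round(referred_cases, 1),
        "deferred_pct_of_hpv_positive": round(100 * deferred_pop / HPV_POSITIVE_POPULATION, 1),
        "cases_referred_pct_of_hpv_positive_cases": round(100 * referred_cases / HPV_POSITIVE_CASES, 1),
    }


automated = summarize(AUTOMATED_RETEST)
conventional = summarize(CONVENTIONAL_RETEST)
print("automated:", automated)
print("conventional:", conventional)
```

From the rounded entries, the automated strategy comes out at about 4.3% of the population referred and about 82.7% of all CIN3/AIS cases referred, and the conventional strategy at about 4.4% and 80.3%, close to the reported 82.6% and 80.2%.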
Focusing on HPV-positive women, the automated strategy would defer 38.4% of them to one-year retesting, while referring the remaining for colposcopy, catching 91.7% of HPV-positive CIN3/AIS cases (compared with 37.4% and 89.1%, respectively, for the conventional strategy). The two strategies also had a comparable performance for the triage of CIN2/CIN3/AIS (Supplementary Table 3, available online). Characteristics on those 20 cases of invasive cancer that were excluded from the 2010 cohort are given in the Supplementary Methods and Supplementary Table 4 (available online). 2016–2017 Cohort Analysis Using the risk prediction model established with the 2010 cohort, we obtained the risk score on the 243 807 women in the 2016–2017 validation cohort. Figure 2 shows the distributions of risk scores by cytology result among 22 732 HPV-positive women; the association of elevated algorithm scores and severe cytology result was strong and highly statistically significant (P < .001, based on Spearman’s rank correlation test). The scores tended to be low for all HPV-negative slides except for the very rare HPV-negative ASC-H/AGC/HSIL (Supplementary Figure 3, available online). Figure 2. View largeDownload slide Distribution of risk scores by cytology results among human papillomavirus (HPV)–positive women from the 2016–2017 validation cohort. Summaries were based on risk scores of HPV-positive women from the 2016–2017 cohort. The risk score was predicted by the risk score model trained with the 2010 cohort. The average scores were equally low among HPV-infected women with negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance, or low-grade squamous intraepithelial lesion. Scores tended to increase when the cytologic result indicated atypical squamous cells rule out high-grade, or atypical glandular cells. 
Elevated scores were observed for women with high-grade squamous intraepithelial lesion (HSIL)/adenocarcinoma in situ (AIS; with relatively very few AIS). Two results of cancer were excluded; both had very high scores. HPV-negative women had uniformly low scores, except for extremely rare HPV-negative HSIL (Supplementary Figure 3, available online). AIS = adenocarcinoma in situ; AGC = atypical glandular cells; ASC-H = atypical squamous cells rule out high-grade; ASC-US = atypical squamous cells of undetermined significance; HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

Discussion

Cervical screening programs internationally are transitioning to primary HPV testing, with cytology reserved for triage of HPV-positive women. The results of this study showed that a computer algorithm matches or exceeds cytology triage performance, confirming our previous proof-of-principle study conducted in a referral population (4). Thus, the findings strongly support the feasibility of totally automated cervical screening without cytology. Automated cervical screening and triage could be particularly appealing for middle-resource settings, permitting extension of high-quality cervical prevention programs to regions limited by lack of skilled cytology professionals. Automation may also be appealing in the United States and other high-resource settings, as an alternative to conventional cytotechnology practice. Of note, the algorithm proved equivalent in performance to an excellent cytology comparator in deciding which women should be referred to colposcopy. During the study period, KPNC employed several strategies to maximize sensitivity of its SurePath liquid-based cytology approach, including unmasking of HPV status at the time of screening, use of FocalPoint to rank severity of slides within batches, re-reading of a fraction of slides, and review of discrepant cases. Nonetheless, both cytology and the algorithm remain imperfect triage methods. The specificity of either method was not high, and the agreement on individual cases was not complete; each method diagnosed precancer cases missed by the other.
Thus, countries adopting primary HPV testing with triage by partial typing and cytology should continue to seek improvement. We would hope to improve the algorithm further, particularly its specificity, by referring fewer women with the lower-risk HPV types. Also, cytology greater than LSIL outperformed the algorithm in identification of the women at very highest risk of precancer. Thus, further improvements to the algorithm will include study of what severe features the algorithm is missing that are being detected by human review. Adenocarcinoma is especially difficult to detect by cytology, including automated cytology; its importance is increasing in well-screened populations, as the burden of squamous cell cancer is reduced. The range of causal HPV types is limited (mainly HPV16, HPV18, and HPV45), but cytologic appearance and the correlated automated result can be equivocal or even negative. It will be particularly important to find a way to predict which few women with HPV18/45 are at high risk of cancer, given that these types do not result in high risk of precancer within the first year following detection. The importance of these types as major causes of cancer is manifested over a longer time horizon, reflecting some combination of difficulty of diagnosis of endocervical lesions and the particular biological importance for these types of viral integration prior to progression to precancer/cancer.

Our study has two main limitations. First, we cross-validated the algorithm on samples from the same population (the 2010 cohort) with a limited sample size. Its performance in a different population might differ. We will retrain and validate the model on the much larger 2016–2017 cohort from KPNC in two or three years, when enough precancer cases have accumulated. Second, we converted the continuous risk score into a three-level categorical value for a direct comparison with the conventional cytology reading.
We can refine the risk group by classifying the score into more levels (such as five or more). But given the limited sample size we had in the 2010 cohort, the resultant risk estimate for each level within a given HPV group might not be reliable. In the future, once we have more histopathologic outcomes from the 2016–2017 cohort, we can obtain risk estimates for more refined subgroups defined by risk score and HPV typing to achieve better risk stratification.

The important remaining questions are how much the algorithm can be further improved by targeting the missed cases detected by cytology and whether other automated systems or strategies can outperform this one. Other automatable systems based on different technologies are also in development. The Roche cobas HPV test combined with automated p16-Ki67 dual stain is under development and evaluation (22). A third emerging approach that is earlier in development and does not require making a slide is HPV typing combined with viral and host methylation (23,24). It seems likely that one or more of these techniques will permit automation of primary screening and triage, particularly in middle- or even high-income settings. Future questions will involve the cost-effectiveness of any automated system for middle- and selected low-resource settings, dissemination and implementation, and finding affordable high-quality triage options for low-resource regions. The ultimate goal remains integration of affordable and high-quality screening/triage with vaccination to promote comprehensive cervical cancer control worldwide.

Funding

The study was funded/directed by the National Cancer Institute Intramural Research Program in collaboration with Kaiser Permanente Northern California. BD provided masked human papillomavirus typing.
Notes

Affiliations of authors: Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD (KY, NH, HZ, JCG, NW, MS); Kaiser Permanente Northern California Regional Laboratory, Berkeley, CA (BF, TL, RES, NEP); Kaiser Permanente Northern California Division of Research, Oakland, CA (TRR); Information Management Services, Calverton, MD (WW, BB); Albert Einstein College of Medicine, Bronx, NY (PEC).

The funders had no role in design of the study; the collection, analysis, or interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

References

1 Schiffman M, Doorbar J, Wentzensen N, et al. Carcinogenic human papillomavirus infection. Nat Rev Dis Primers. 2016;2:16086.
2 Bosch FX, Robles C, Diaz M, et al. HPV-FASTER: Broadening the scope for prevention of HPV-related cancer. Nat Rev Clin Oncol. 2016;13(2):119–132.
3 Vaccarella S, Laversanne M, Ferlay J, et al. Cervical cancer in Africa, Latin America and the Caribbean and Asia: Regional inequalities and changing trends. Int J Cancer. 2017;141(10):1997–2001.
4 Schiffman M, Yu K, Zuna R, et al. Proof-of-principle study of a novel cervical screening and triage strategy: Computer-analyzed cytology to decide which HPV-positive women are likely to have ≥CIN2. Int J Cancer. 2017;140(3):718–725.
5 Ronco G, Dillner J, Elfstrom KM, et al. Efficacy of HPV-based screening for prevention of invasive cervical cancer: Follow-up of four European randomised controlled trials. Lancet. 2014;383(9916):524–532.
6 Wentzensen N, Schiffman M, Palmer T, et al. Triage of HPV positive women in cervical cancer screening. J Clin Virol. 2016;76(Suppl 1):S49–S55.
7 Naucler P, Ryd W, Tornberg S, et al. Efficacy of HPV DNA testing with cytology triage and/or repeat HPV DNA testing in primary cervical cancer screening. J Natl Cancer Inst. 2009;101(2):88–99.
8 Huh WK, Ault KA, Chelmow D, et al. Use of primary high-risk human papillomavirus testing for cervical cancer screening: Interim clinical guidance. Obstet Gynecol. 2015;125(2):330–337.
9 Luttmer R, De Strooper LM, Berkhof J, et al. Comparing the performance of FAM19A4 methylation analysis, cytology and HPV16/18 genotyping for the detection of cervical (pre)cancer in high-risk HPV-positive women of a gynecologic outpatient population (COMETH study). Int J Cancer. 2016;138(4):992–1002.
10 Velentzis LS, Caruana M, Simms KT, et al. How will transitioning from cytology to HPV testing change the balance between the benefits and harms of cervical cancer screening? Estimates of the impact on cervical cancer, treatment rates and adverse obstetric outcomes in Australia, a high vaccination coverage country. Int J Cancer. 2017;141(12):2410–2422.
11 Schiffman M, Hyun N, Raine-Bennett TR, et al. A cohort study of cervical screening using partial HPV typing and cytology triage. Int J Cancer. 2016;139(11):2606–2615.
12 Gage JC, Schiffman M, Katki HA, et al. Reassurance against future risk of precancer and cancer conferred by a negative human papillomavirus test. J Natl Cancer Inst. 2014;106(8):dju153.
13 Castle PE, Gutierrez EC, Leitch SV, et al. Evaluation of a new DNA test for detection of carcinogenic human papillomavirus. J Clin Microbiol. 2011;49(8):3029–3032.
14 Carreon JD, Sherman ME, Guillen D, et al. CIN2 is a much less reproducible and less valid diagnosis than CIN3: Results from a histological review of population-based cervical samples. Int J Gynecol Pathol. 2007;26(4):441–446.
15 Kardos TF. The FocalPoint System: FocalPoint slide profiler and FocalPoint GS. Cancer. 2004;102(6):334–339.
16 Tibshirani R. Regression shrinkage and selection via the lasso. J Royal Statist Soc B. 1996;58(1):267–288.
17 Friedman JH, Hastie T, Simon N, et al. Package ‘glmnet.’ https://cran.r-project.org/web/packages/glmnet/glmnet.pdf. Accessed August 1, 2017.
18 Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer; 2001.
19 Robin X, Turck N, Hainard A, et al. Package ‘pROC.’ https://cran.r-project.org/web/packages/pROC/pROC.pdf. Accessed August 1, 2017.
20 Hyun N, Cheung L, Pan Q, et al. Flexible risk prediction models for left or interval-censored data from electronic health records. Ann Appl Stat. 2017;11(2):1063–1084.
21 Schiffman M, Kinney WK, Cheung LC, et al. Relative performance of HPV and cytology components of cotesting in cervical screening. J Natl Cancer Inst. 2018;110(5):djx225.
22 Wentzensen N, Fetterman B, Castle PE, et al. p16/Ki-67 dual stain cytology for detection of cervical precancer in HPV-positive women. J Natl Cancer Inst. 2015;107(12):djv257.
23 Lorincz AT, Brentnall AR, Scibior-Bentkowska D, et al. Validation of a DNA methylation HPV triage classifier in a screening sample. Int J Cancer. 2016;138(11):2745–2751.
24 Wentzensen N, Sun C, Ghosh A, et al. Methylation of HPV18, HPV31, and HPV45 genomes and cervical intraepithelial neoplasia grade 3. J Natl Cancer Inst. 2012;104(22):1738–1749.

Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

JNCI: Journal of the National Cancer Institute, Oxford University Press

Publisher
Oxford University Press
ISSN
0027-8874
eISSN
1460-2105
D.O.I.
10.1093/jnci/djy044

Abstract

Background
State-of-the-art cervical cancer prevention includes human papillomavirus (HPV) vaccination among adolescents and screening/treatment of cervical precancer (CIN3/AIS and, less strictly, CIN2) among adults. HPV testing provides sensitive detection of precancer but, to reduce overtreatment, secondary “triage” is needed to predict women at highest risk. Those with the highest-risk HPV types or abnormal cytology are commonly referred to colposcopy; however, expert cytology services are critically lacking in many regions.

Methods
To permit completely automatable cervical screening/triage, we designed and validated a novel triage method, a cytologic risk score algorithm based on computer-scanned liquid-based slide features (FocalPoint, BD, Burlington, NC). We compared it with abnormal cytology in predicting precancer among 1839 women testing HPV positive (HC2, Qiagen, Germantown, MD) in 2010 at Kaiser Permanente Northern California (KPNC). Precancer outcomes were ascertained by record linkage. As additional validation, we compared the algorithm prospectively with cytology results among 243 807 women screened at KPNC (2016–2017). All statistical tests were two-sided.

Results
Among HPV-positive women, the algorithm matched the triage performance of abnormal cytology. Combined with HPV16/18/45 typing (Onclarity, BD, Sparks, MD), the automatable strategy referred 91.7% of HPV-positive CIN3/AIS cases to immediate colposcopy while deferring 38.4% of all HPV-positive women to one-year retesting (compared with 89.1% and 37.4%, respectively, for typing and cytology triage). In the 2016–2017 validation, the predicted risk scores strongly correlated with cytology (P < .001).

Conclusions
High-quality cervical screening and triage performance is achievable using this completely automated approach. Automated technology could permit extension of high-quality cervical screening/triage coverage to currently underserved regions.
Cervical cancer is caused by persistent infection with carcinogenic human papillomaviruses (HPV) (1). Prophylactic HPV vaccination to control transmission is the ultimate prevention strategy; however, vaccination remains a long-term solution due to still-limited coverage and the long latency between infection and cancer development (2). Improving and expanding cervical screening with a focus on increasingly vaccinated cohorts and lower-resource settings will remain a global prevention priority for decades; otherwise, the global burden of cervical cancer will continue to rise (3). It might be possible to expand the reach of screening programs using automation to overcome the chronic shortage of skilled laboratory professionals (4).

The main goal of cervical screening is detection of high-grade “precancers” (defined by histopathology of CIN3 and AIS, or less stringently for maximal protection as CIN2/CIN3/AIS), which can be treated to prevent cervical cancer mortality and morbidity. Cervical screening programs include two distinct procedures: screening of the general population and triage of screen-positive women to focus treatment on precancers. For the general screening phase, HPV testing for the carcinogenic types of HPV is gradually replacing cytology (Pap testing), because HPV testing is more sensitive for detection of precancer with more sustained negative predictive value (5). It is also more reproducible and adaptable to increasingly HPV-vaccinated populations. However, a pressing and unsolved problem is how best to identify, among the 5% to 10% of women (or more) who test positive with HPV screening, the small minority who are most likely to have precancer requiring treatment. Most women control/clear their infections and do not need treatment, which involves ablation or excision of the ring of cervical tissue especially susceptible to HPV-induced carcinogenesis (1).
It is partly the cost and complexity of triage testing needed to avoid overtreatment that is impeding widespread adoption of HPV screening programs (6). The leading triage strategy in higher-resource regions combines partial HPV typing (to identify the highest-risk types) and cytology (used in this case as a second test among HPV-positive women) (7–10). Conventional cytology, which is currently combined with HPV testing in most US screening programs to determine which women should go to colposcopy, is labor-intensive. Use of computer-interpreted cytology for triage would permit the automation of the whole screening process. Here we report the design and evaluation of such a fully automatable cervical screening strategy to determine whether an automated triage algorithm could triage HPV-positive women as accurately as conventionally interpreted liquid-based SurePath cytology (BD, Burlington, NC).

Methods

Study Design and Participants

The study was conducted in two parts, both using data from the Kaiser Permanente Northern California (KPNC) cervical screening program: 1) development and validation of the risk score algorithm predicting precancer in a cohort analysis of women cotested with HPV and computer-scanned liquid-based cytology in 2010 and 2) validation of the association between higher risk scores and HPV-positive, abnormal cytology cotest results, in a large cohort of women screened in 2016–2017. We developed and initially validated the computer algorithm based on stored, residual cervical specimens and liquid-based cytology slides from the Persistence and Progression (PaP) study conducted within the KPNC cervical screening program, which has used cytology-HPV cotesting since 2003 (11,12). The NCI-KPNC PaP study was formed by collection, neutralization, and freezing of discarded aliquots of routinely collected cervical specimens in STM buffer tested for HPV by HC2 (Qiagen, Germantown, MD).
The baseline collection included 54 723 women with cotesting in 2007–2011. When women in the PaP cohort returned for later cotest visits, additional discard specimens were collected and stored. Clinical outcomes of infections were ascertained by merging data from the KPNC cytopathology, histopathology, and HPV testing files, complete at the time of analysis through 2015. To develop the risk score algorithm, we used HPV test specimens and the corresponding stored cotest liquid-based cytology slides (SurePath, BD Diagnostics, Sparks, MD) from the subgroup of women in the PaP cohort with an HC2-positive baseline or follow-up visit in 2010. The final analytic set (called the 2010 cohort) consisted of 1839 subjects. We considered separately the subset of 20 cases with histopathologic outcomes of invasive cancer, because we were mainly studying screening to find treatable precancer. The flowchart for obtaining the final set is given in Supplementary Figure 1 (available online). To further validate the risk score algorithm, we used a set of 243 807 scanned SurePath slides from routine cervical screening at KPNC in late 2016 to mid-2017 (called the 2016–2017 cohort). All specimen use in this study has been approved by both the KPNC and National Cancer Institute Institutional Review Boards. Use of masked discard specimens was judged not to require written informed consent; women could opt out of use of their specimens (∼8% did).

Partial HPV Typing

We conducted partial HPV typing of the HC2-positive women from 2010 stored aliquots using Onclarity (Becton Dickinson Diagnostics, Sparks, MD), a nine-channel DNA amplification and detection assay recently approved by the US Food and Drug Administration for HPV testing and separate identification of HPV16, HPV18, and HPV45 (13). Here we combined the nine HPV type channel results into four HPV groups according to the earlier work (11).
The pre-established, hierarchical order from highest to lowest risk of precancer/cancer was HPV16, else HPV31/33/52/58, else HPV18/45, else HPV35/39/51/56/59/66/68.

Histopathology Outcomes

In the analysis of the 2010 cohort, we first ascertained the worst histologic diagnosis made by KPNC pathologists during follow-up from the time of the screening cotest to the end of 2015 (median = 3.7 years, interquartile range = 2.3–4.6 years). We chose CIN2/CIN3/AIS as the measurement of precancer for algorithm training, because it represents the treatment threshold in this age group at KPNC. When evaluating the performance of the established algorithm, we also considered CIN3/AIS as the more stringent histopathologic diagnoses better representing true precancer (14), although the numbers of cases were more limited. For the absolute risk estimate, we focused on three-year risk of precancer to include an entire screening cycle.

Statistical Analysis

We used the likelihood ratio test based on the linear regression model to compare risk score distributions among different groups, the likelihood ratio test based on the logistic regression model to evaluate the association between a risk factor and the case–control status, and McNemar’s test to compare sensitivity and specificity between two given diagnosis rules. All statistical tests were two-sided, and a P value of less than .05 was considered statistically significant.

Development of the Risk Score Algorithm

We developed a novel risk score algorithm (called “the algorithm”) to triage HPV-positive women using a redesign of BD FocalPoint cytology to target precancer. FocalPoint is a slide scanner that performs a high-speed image capture and outputs 160 binary or quantitative “features” of the cytology slide such as presence of different cell types, nuclear size, and nuclear contour (15).
The severity rank generated by the commercial algorithm is currently designed to identify the most innocuous slides in a batch, to reduce and/or guide cytotechnologist workload. Details of the new algorithm that repurposed FocalPoint optical scanning features to target precancer instead are outlined in the Supplementary Methods (available online). Briefly, based on the 2010 cohort of 1839 subjects, we adopted the least absolute shrinkage and selection operator (LASSO) model (16) implemented in the R package glmnet (17) to create the risk score predicting the likelihood of the presence or imminent development of precancer (CIN2/CIN3/AIS) among HPV-positive women. To have an unbiased evaluation of the risk score using the same training set, we employed the leave-one-out cross-validation (LOOCV) procedure (18) to produce an unbiased estimated risk score for each subject for subsequent analyses of the 2010 cohort.

Comparing the Algorithm With Cytology

Cytology can divide HPV-positive women into one of three categories to reflect the underlying risk for precancer: using Bethesda System terminology, they are normal cytology (NILM), minor HPV-related abnormalities (ASC-US/LSIL), or higher-risk results (>LSIL, ie, AGC/ASC-H/HSIL/AIS/cancer). Most HPV-based screening programs refer to colposcopy all HPV-positive women whose reflex cytology is not normal (ie, ASC-US or worse). To make a direct comparison between the performances of the algorithm and cytology in stratifying risk of precancer among HPV-positive women, we chose two cut-points in the automated risk score to produce three risk groups (called high, intermediate, and low risk) of exactly the same sizes as the ordinary cytology strategy groups of >LSIL, ASC-US/LSIL, and NILM. We used the R package pROC (19) to calculate receiver operating characteristic (ROC) curves to show the sensitivity/specificity trade-offs for the risk score.
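The size-matched grouping described above can be sketched as a rank-based assignment: the highest-scoring women fill the high-risk group until it is as large as the >LSIL cytology group, the next highest fill the intermediate group, and the rest are low risk. This is a minimal illustration, not the authors' code. The function name `assign_risk_groups` and the tie-breaking rule (input order) are assumptions; the group sizes 109, 826, and 904 are the cytology group sizes reported for the 2010 cohort.

```python
def assign_risk_groups(scores, n_high, n_intermediate):
    """Label each score 'high', 'intermediate', or 'low' by rank.

    The n_high largest scores become 'high', the next n_intermediate
    become 'intermediate', and the remainder 'low'. Ties are broken by
    input order, which is an arbitrary choice made for this sketch.
    """
    if n_high < 0 or n_intermediate < 0:
        raise ValueError("group sizes must be non-negative")
    if n_high + n_intermediate > len(scores):
        raise ValueError("group sizes exceed the number of scores")

    # Indices of the scores sorted from highest to lowest risk score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    groups = ["low"] * len(scores)
    for rank, index in enumerate(order):
        if rank < n_high:
            groups[index] = "high"
        elif rank < n_high + n_intermediate:
            groups[index] = "intermediate"
    return groups


# Toy example with six made-up scores: one high, two intermediate, three low.
toy_groups = assign_risk_groups([0.1, 0.9, 0.5, 0.3, 0.7, 0.2], 1, 2)
print(toy_groups)
```

With 1839 scores and the sizes 109 and 826, the remaining 904 women fall into the low-risk group, mirroring the NILM group size.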
To compare the algorithm with cytology, we conducted McNemar’s test to compare sensitivities for the detection of precancer achieved by the two strategies at the ASC-US or worse threshold (intermediate- or high-risk score), or >LSIL threshold (high-risk score).

Assessing the Algorithm and Partial HPV Typing as a Triage Strategy

We compared absolute risks for precancer in the 12 groups of HPV-positive women formed by the combination of four HPV type groups and three cytology or corresponding risk score groups. Using a novel logistic Cox model (20), we estimated three-year cumulative risks (including the risk for precancers present at the time of screening) of CIN3/AIS or CIN2/CIN3/AIS for the 12 strata (Supplementary Methods, available online). We chose this model because of interval censoring and existence of prevalent events. To extend the analysis to the whole KPNC screening population, we also estimated precancer risk for HPV-negative women based on women age 30 years and older from KPNC who tested negative by HC2 in 2012 (n = 239 948). In current practice in several countries, HPV-positive women are managed according to a combination of partial HPV typing, when available, and cytology (7–10). Women with the highest-risk HPV types (HPV16, HPV18, and sometimes HPV45) and/or abnormal cytology are referred to colposcopy; the remaining HPV-positive women are retested (at one year in the United States). For such a strategy, we compared the triage procedures (ASC-US or worse cytology vs intermediate-risk score threshold), estimating the percentage of precancers occurring within the subsequent three years that they each would have referred. The P value for this comparison was evaluated through a standard bootstrap procedure.
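The management rule compared in this study can be written as a small decision function. The sketch below is illustrative only. It assumes the detected HPV types are available as individual integers (the Onclarity assay reports some types in pooled channels, so this is a simplification), and the function names and the "routine screening" label for HPV-negative women are hypothetical. The hierarchy and the referral rule follow the description in the text.

```python
# Hierarchical HPV type groups, ordered from highest to lowest risk,
# following the pre-established order described in the Methods.
HPV_HIERARCHY = (
    ("HPV16", frozenset({16})),
    ("HPV31/33/52/58", frozenset({31, 33, 52, 58})),
    ("HPV18/45", frozenset({18, 45})),
    ("HPV35/39/51/56/59/66/68", frozenset({35, 39, 51, 56, 59, 66, 68})),
)

# Types referred to colposcopy regardless of the cytologic result.
DIRECT_REFERRAL_TYPES = frozenset({16, 18, 45})


def hpv_group(types):
    """Return the highest-ranked group containing any detected type."""
    detected = set(types)
    for name, members in HPV_HIERARCHY:
        if members & detected:
            return name
    return None  # no carcinogenic type detected


def triage(types, risk_group):
    """Return the management step for one woman.

    risk_group is 'low', 'intermediate', or 'high' for the automated
    strategy; the cytology categories NILM, ASC-US/LSIL, and >LSIL would
    play the same roles in the conventional strategy.
    """
    detected = set(types)
    if hpv_group(detected) is None:
        return "routine screening"  # hypothetical label for HPV-negative women
    if detected & DIRECT_REFERRAL_TYPES or risk_group in ("intermediate", "high"):
        return "colposcopy"
    return "retest at one year"


print(triage({31}, "low"))           # lower-risk type with a low score
print(triage({45, 66}, "low"))       # HPV45 triggers direct referral
print(hpv_group({18, 31}))           # hierarchy places HPV31 above HPV18
```

Keeping the hierarchy as an ordered tuple makes the "else" logic of the grouping explicit: the first matching group wins.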
Further Validation in the 2016–2017 Cohort

We applied the established algorithm, trained with the 2010 cohort of 1839 subjects, to FocalPoint output obtained from 243 807 slides performed during routine cervical screening at KPNC in late 2016 to mid-2017. We validated the algorithm results against HPV/cytology cotest results obtained from those same screening visits. We used the Spearman’s rank correlation coefficient test for this validation.

Results

Analysis of 2010 Cohort

The 2010 cohort consisted of 1839 HPV-positive women, including the 1529 <CIN2 controls (including women with no referral to colposcopy, or colposcopy/biopsy diagnosis of <CIN2) and 310 cases (181 CIN2 and 129 CIN3/AIS). More detailed baseline data are given in Supplementary Table 1 (available online). Based on the entire 2010 cohort, the final LASSO model for the risk score selected 55 features (Supplementary Table 2, available online). The distributions of the risk score (unbiased estimate by the LOOCV procedure), stratified by cytology and histopathology, are summarized in Supplementary Figure 2 (available online). The cytology results of NILM, ASC-US/LSIL, and >LSIL represented 49%, 45%, and 5.9% of samples in the 2010 cohort. By design, the same proportions of subjects were divided into three risk groups (low, intermediate, or high) according to their risk scores. Table 1 shows the number of control and case subjects falling into each group for each strategy, individually and jointly. The three risk groups defined by each strategy were strongly associated with case–control status (P < .001). Although the algorithm and cytology showed good agreement, each generated further risk stratification. For example, after adjusting for the standard cytology strategy, the risk score–based risk groups were still strongly associated with case–control status (P < .001).

Table 1. Concordance between the risk groups based on the cytology result and the risk score, in the 2010 cohort of 1839 HPV-positive women, by case–control status*

Cytology-based risk group | Risk score group: High, 53 cases (56 controls) | Risk score group: Intermediate, 181 cases (645 controls) | Risk score group: Low, 76 cases (828 controls)
>LSIL, 69 cases (40 controls) | 25 (5)† | 39 (27) | 5 (8)
ASC-US/LSIL, 151 cases (675 controls) | 22 (36) | 98 (352) | 31 (287)
NILM, 90 cases (814 controls) | 6 (15) | 44 (266) | 40 (533)

* The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the 2010 cohort (904, 826, 109). ASC-US = atypical squamous cells of undetermined significance; CIN2 = cervical intraepithelial neoplasia grade II; HPV = human papillomavirus; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.
† Number of ≥CIN2 cases (number of <CIN2 controls) occurring within three years of screening tests.

Figure 1A shows the ROC curve based on the risk score for diagnosis of CIN2/CIN3/AIS, with an area under the curve (AUC) of 0.71 (95% confidence interval [CI] = 0.68 to 0.74). Also shown on the figure are the comparisons of sensitivity/specificity at thresholds corresponding to ≥ASC-US and >LSIL.
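The sensitivity and specificity of each strategy at the ≥ASC-US-equivalent threshold can be recomputed directly from the Table 1 cell counts. The sketch below is a worked check rather than the authors' analysis code; the dictionary layout and the function name `sens_spec` are illustrative choices, while the counts are taken from Table 1.

```python
# Table 1 cell counts: TABLE1[cytology group][risk score group] = (cases, controls)
TABLE1 = {
    ">LSIL": {"high": (25, 5), "intermediate": (39, 27), "low": (5, 8)},
    "ASC-US/LSIL": {"high": (22, 36), "intermediate": (98, 352), "low": (31, 287)},
    "NILM": {"high": (6, 15), "intermediate": (44, 266), "low": (40, 533)},
}


def sens_spec(is_positive):
    """Sensitivity and specificity of a referral rule.

    is_positive(cytology_group, score_group) returns True when the rule
    would refer women in that Table 1 cell to colposcopy.
    """
    true_pos = false_neg = false_pos = true_neg = 0
    for cytology_group, row in TABLE1.items():
        for score_group, (cases, controls) in row.items():
            if is_positive(cytology_group, score_group):
                true_pos += cases
                false_pos += controls
            else:
                false_neg += cases
                true_neg += controls
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Automated strategy: refer intermediate- or high-risk scores.
score_sens, score_spec = sens_spec(lambda cytology, score: score != "low")
# Conventional strategy: refer ASC-US or worse cytology.
cyt_sens, cyt_spec = sens_spec(lambda cytology, score: cytology != "NILM")

print(f"risk score: sensitivity {score_sens:.2f}, specificity {score_spec:.2f}")
print(f"cytology:   sensitivity {cyt_sens:.2f}, specificity {cyt_spec:.2f}")
```

For the automated strategy this amounts to 234 of 310 cases referred and 828 of 1529 controls not referred; for cytology, 220 of 310 and 814 of 1529.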
The risk score threshold corresponding to ≥ASC-US (ie, low- vs intermediate/high-risk groups) had a sensitivity of 0.75 (95% CI = 0.70 to 0.80) and a specificity of 0.54 (95% CI = 0.51 to 0.57). Neither was statistically significantly higher than its counterpart under the standard cytology strategy, which had a sensitivity of 0.71 (95% CI = 0.66 to 0.76) and a specificity of 0.53 (95% CI = 0.50 to 0.56; P = .16 and .59 for the sensitivity and specificity comparisons, respectively). At the threshold corresponding to >LSIL (meant to define a small highest-risk group), the algorithm had lower sensitivity and specificity than cytology, but the differences were not statistically significant (P = .08 and .11 for the sensitivity and specificity comparisons, respectively).

Figure 1. The automated risk score receiver operating characteristic (ROC) curve for the detection of cases among human papillomavirus (HPV)–positive women and sensitivity/specificity comparisons between risk groups derived from the automated risk score and conventional cytology result. A) For the detection of cervical intraepithelial neoplasia grade II (CIN2)/cervical intraepithelial neoplasia grade III (CIN3)/adenocarcinoma in situ (AIS). B) For the detection of CIN3/AIS. Results were based on the 2010 cohort, with the risk score unbiasedly estimated by the leave-one-out cross-validation (LOOCV) procedure. The ROC curve plots the trade-off of sensitivity and specificity for increasingly elevated risk scores. Both areas under the curve reflected good overall discrimination between 310 CIN2/CIN3/AIS cases (or 190 CIN3/AIS cases) and 1529 controls (<CIN2), with a better area under the curve (AUC) when cases were defined more stringently as CIN3/AIS (excluding CIN2). Direct comparison of the conventional cytology strategy with the algorithm strategy was achieved by first dividing risk according to cytology into three groups (negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance/low-grade squamous intraepithelial lesion [LSIL], and >LSIL; see the Methods for details). The risk scores were divided at two cut-points to create three risk groups of the same sizes as the cytology groups. The sensitivity and specificity of the two strategies at those two cut-points were compared. AUC = area under the curve; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

When cases were restricted to CIN3/AIS (n = 190), with controls still defined as less than CIN2, the AUC of the risk score increased slightly, from 0.71 to 0.74 (95% CI = 0.69 to 0.78) (Figure 1B). The algorithm showed a marginal sensitivity advantage over the corresponding cytology threshold ≥ASC-US (P = .09).

We illustrated how the automated screening and triage strategy, which combined HPV testing, partial HPV typing, and the algorithm, might work in real practice within the total screened population, including HPV-negative as well as HPV-positive women, for the more important diagnosis of CIN3/AIS (Table 2) and, secondarily, for CIN2/CIN3/AIS (Supplementary Table 3, available online). Details on how those calculations were done are given in the Supplementary Methods (available online). For general screening, approximately 93% of women would fall into the HPV-negative group (as estimated by HC2 negativity in the full KPNC cohort tested in 2010). Although HPV-positive women are the focus of this analysis, it is noteworthy that a single negative HPV test would predict a very low absolute risk of precancer in the subsequent years (21), but within this large group, 9.9% of CIN3/AIS would be missed.
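The 9.9% figure follows from weighting each group's three-year risk by its share of the screened population. The sketch below redoes that arithmetic from the rounded values reported in Table 2 (93% HPV-negative with a 3-year risk of 0.0006; 7% HPV-positive with a 3-year risk of 0.0724). It is a back-of-envelope check under those rounded inputs, not the calculation described in the Supplementary Methods, and the variable names are illustrative.

```python
# Rounded inputs taken from Table 2 (population share, three-year CIN3/AIS risk).
share_neg, risk_neg = 0.93, 0.0006   # HPV-negative women
share_pos, risk_pos = 0.07, 0.0724   # HPV-positive women

# Expected CIN3/AIS cases per screened woman contributed by each group.
cases_neg = share_neg * risk_neg
cases_pos = share_pos * risk_pos
total_cases = cases_neg + cases_pos

pct_cases_neg = 100 * cases_neg / total_cases
pct_cases_pos = 100 * cases_pos / total_cases
print(f"Cases among HPV-negative women: {pct_cases_neg:.1f}%")  # 9.9%
print(f"Cases among HPV-positive women: {pct_cases_pos:.1f}%")  # 90.1%
```

The split shows why a very low per-woman risk can still account for roughly one in ten cases: the HPV-negative group is more than 13 times larger than the HPV-positive group.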
Table 2. Estimated three-year cumulative risk of CIN3/AIS given the HPV typing group and the risk score group/cytology group*

| HPV status and risk group | Automated: score level | Automated: % of population | Automated: 3-y risk | Automated: % cases diagnosed | Conventional: cytology group | Conventional: % of population | Conventional: 3-y risk | Conventional: % cases diagnosed | Action required |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HPV-positive (all) | | 7 | 0.0724 | 90.1 | | 7 | 0.0724 | 90.1 | |
| HPV16 | High | 0.1 | 0.4117 | 7.5 | >LSIL | 0.1 | 0.5562 | 12.1 | Referral |
| HPV16 | Intermediate | 0.5 | 0.3144 | 28.4 | ASC-US/LSIL | 0.5 | 0.2659 | 21.0 | Referral |
| HPV16 | Low | 0.4 | 0.0967 | 7.0 | NILM | 0.5 | 0.1320 | 10.3 | Referral |
| HPV31/33/52/58 | High | 0.1 | 0.4256 | 7.5 | >LSIL | 0.1 | 0.3700 | 8.1 | Referral |
| HPV31/33/52/58 | Intermediate | 1.0 | 0.1081 | 18.2 | ASC-US/LSIL | 1.0 | 0.1031 | 16.8 | Referral |
| HPV31/33/52/58 | Low | 1.1 | 0.0264 | 4.7 | NILM | 1.0 | 0.0298 | 5.2 | Retest |
| HPV18/45 | High | <0.1 | 0.2002 | 1.3 | >LSIL | <0.1 | 0.1539 | 1.3 | Referral |
| HPV18/45 | Intermediate | 0.3 | 0.0622 | 3.4 | ASC-US/LSIL | 0.3 | 0.0456 | 2.1 | Referral |
| HPV18/45 | Low | 0.3 | <0.0001 | <0.1 | NILM | 0.4 | 0.0215 | 1.3 | Referral |
| HPV35/39/51/56/59/66/68 | High | 0.2 | 0.0227 | 0.6 | >LSIL | 0.1 | 0.1429 | 2.6 | Referral |
| HPV35/39/51/56/59/66/68 | Intermediate | 1.3 | 0.0389 | 8.7 | ASC-US/LSIL | 1.4 | 0.0197 | 4.8 | Referral |
| HPV35/39/51/56/59/66/68 | Low | 1.6 | 0.010 | 2.7 | NILM | 1.6 | 0.0173 | 4.6 | Retest |
| HPV-negative | NA | 93 | 0.0006 | 9.9 | NA | 93 | 0.0006 | 9.9 | Screening |

* The risk score was estimated by a leave-one-out cross-validated LASSO model that integrated FocalPoint scanned features. The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the analytic set (49%, 45%, and 6%). The risk was estimated by the logistic Cox model. See the Supplementary Methods (available online) for more details on how estimates in the table were obtained. AIS = adenocarcinoma in situ; ASC-US = atypical squamous cells of undetermined significance; CIN3 = cervical intraepithelial neoplasia grade III; HPV = human papillomavirus; LASSO = least absolute shrinkage and selection operator; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

In triage of the 7% of women testing HPV-positive, the automated and conventional strategies would both refer to colposcopy those women with HPV16, HPV18, or HPV45. As the major point of comparison, those with abnormal cytology or an intermediate/high-risk score also would be referred to colposcopy, while women with a cytologic result of NILM (or in the low-risk score group) and without HPV16, HPV18, or HPV45 would be retested at one year. In the whole screened population, the conventional strategy would refer 4.4% of screened women for immediate colposcopy; included in this referred group would be 80.2% of women with prevalent or incident (within three years) CIN3/AIS. The automated strategy would refer a similar proportion of the total population (4.3%) and catch a slightly higher percentage of CIN3/AIS (82.6%, bootstrap estimated P = .54).
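The referral rule and the resulting case capture can be traced through Table 2 with a short sketch. The rule below is a reading of the triage description above (refer if HPV16 or HPV18/45 is present, or if the risk level is not low; otherwise retest at one year). The names `triage` and `ROWS` are hypothetical, table entries are rounded, and the entry reported as "<0.1" is treated as 0.0, which is an assumption.

```python
# Rows of Table 2: (HPV group, level, % cases automated, % cases conventional, action in table).
# Levels map High/Intermediate/Low to >LSIL / ASC-US/LSIL / NILM for the conventional strategy.
ROWS = [
    ("HPV16", "High", 7.5, 12.1, "Referral"),
    ("HPV16", "Intermediate", 28.4, 21.0, "Referral"),
    ("HPV16", "Low", 7.0, 10.3, "Referral"),
    ("HPV31/33/52/58", "High", 7.5, 8.1, "Referral"),
    ("HPV31/33/52/58", "Intermediate", 18.2, 16.8, "Referral"),
    ("HPV31/33/52/58", "Low", 4.7, 5.2, "Retest"),
    ("HPV18/45", "High", 1.3, 1.3, "Referral"),
    ("HPV18/45", "Intermediate", 3.4, 2.1, "Referral"),
    ("HPV18/45", "Low", 0.0, 1.3, "Referral"),  # automated value reported as "<0.1"
    ("HPV35/39/51/56/59/66/68", "High", 0.6, 2.6, "Referral"),
    ("HPV35/39/51/56/59/66/68", "Intermediate", 8.7, 4.8, "Referral"),
    ("HPV35/39/51/56/59/66/68", "Low", 2.7, 4.6, "Retest"),
]

def triage(hpv_group, level):
    """Refer HPV16 and HPV18/45 regardless of level; otherwise refer only non-low levels."""
    if hpv_group in ("HPV16", "HPV18/45") or level != "Low":
        return "Referral"
    return "Retest"

# The rule should reproduce the "Action required" column for every row.
rule_matches_table = all(triage(g, lvl) == action for g, lvl, _, _, action in ROWS)

# Share of all CIN3/AIS cases that fall in referred rows, per strategy.
auto_referred = sum(a for g, lvl, a, _, _ in ROWS if triage(g, lvl) == "Referral")
conv_referred = sum(c for g, lvl, _, c, _ in ROWS if triage(g, lvl) == "Referral")
print(rule_matches_table, round(auto_referred, 1), round(conv_referred, 1))
```

Summing the rounded table entries gives 82.6% for the automated strategy, matching the text, and 80.4% for the conventional strategy, slightly above the reported 80.2%; the gap is presumably due to rounding of the individual table entries.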
Focusing on HPV-positive women, the automated strategy would defer 38.4% of them to one-year retesting, while referring the remainder for colposcopy, catching 91.7% of HPV-positive CIN3/AIS cases (compared with 37.4% and 89.1%, respectively, for the conventional strategy). The two strategies also had comparable performance for the triage of CIN2/CIN3/AIS (Supplementary Table 3, available online). Characteristics of the 20 cases of invasive cancer that were excluded from the 2010 cohort are given in the Supplementary Methods and Supplementary Table 4 (available online).

2016–2017 Cohort Analysis

Using the risk prediction model established with the 2010 cohort, we obtained the risk score for the 243 807 women in the 2016–2017 validation cohort. Figure 2 shows the distributions of risk scores by cytology result among 22 732 HPV-positive women; the association between elevated algorithm scores and severe cytology results was strong and highly statistically significant (P < .001, based on Spearman’s rank correlation test). The scores tended to be low for all HPV-negative slides except for the very rare HPV-negative ASC-H/AGC/HSIL (Supplementary Figure 3, available online).

Figure 2. Distribution of risk scores by cytology results among human papillomavirus (HPV)–positive women from the 2016–2017 validation cohort. Summaries were based on risk scores of HPV-positive women from the 2016–2017 cohort. The risk score was predicted by the risk score model trained with the 2010 cohort. The average scores were equally low among HPV-infected women whose cytology was negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance, or low-grade squamous intraepithelial lesion. Scores tended to increase when the cytologic result indicated atypical squamous cells rule out high-grade, or atypical glandular cells. Elevated scores were observed for women with high-grade squamous intraepithelial lesion (HSIL)/adenocarcinoma in situ (AIS), of which relatively few were AIS. Two results of cancer were excluded; both had very high scores. HPV-negative women had uniformly low scores, except for extremely rare HPV-negative HSIL (Supplementary Figure 3, available online). AIS = adenocarcinoma in situ; AGC = atypical glandular cells; ASC-H = atypical squamous cells rule out high-grade; ASC-US = atypical squamous cells of undetermined significance; HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.
Discussion

Cervical screening programs internationally are transitioning to primary HPV testing, with cytology reserved for triage of HPV-positive women. The results of this study showed that a computer algorithm can match cytology triage performance, confirming our previous proof-of-principle study conducted in a referral population (4). Thus, the findings support the feasibility of totally automated cervical screening and triage that does not depend on human-read cytology.

Automated cervical screening and triage could be particularly appealing for middle-resource settings, permitting extension of high-quality cervical prevention programs to regions limited by a lack of skilled cytology professionals. Automation may also be appealing in the United States and other high-resource settings, as an alternative to conventional cytotechnology practice.

Of note, the algorithm proved equivalent in performance to an excellent cytology comparator in deciding which women should be referred to colposcopy. During the study period, KPNC employed several strategies to maximize the sensitivity of its SurePath liquid-based cytology approach, including unmasking of HPV status at the time of screening, use of FocalPoint to rank severity of slides within batches, re-reading of a fraction of slides, and review of discrepant cases.

Nonetheless, both cytology and the algorithm remain imperfect triage methods. The specificity of either method was not high, and the agreement on individual cases was not complete; each method diagnosed precancer cases missed by the other. Thus, countries adopting primary HPV testing with triage by partial typing and cytology should continue to seek improvement. We would hope to improve the algorithm further, particularly its specificity, by referring fewer women with the lower-risk HPV types. Also, cytology greater than LSIL outperformed the algorithm in identification of the women at very highest risk of precancer. Thus, further improvements to the algorithm will include study of which severe features detected by human review are being missed by the algorithm.

Adenocarcinoma is especially difficult to detect by cytology, including automated cytology; its importance is increasing in well-screened populations, as the burden of squamous cell cancer is reduced. The range of causal HPV types is limited (mainly HPV16, HPV18, and HPV45), but the cytologic appearance and the correlated automated result can be equivocal or even negative. It will be particularly important to find a way to predict which few women with HPV18/45 are at high risk of cancer, given that these types do not result in high risk of precancer within the first year following detection. The importance of these types as major causes of cancer is manifested over a longer time horizon, reflecting some combination of the difficulty of diagnosing endocervical lesions and the particular biological importance, for these types, of viral integration prior to progression to precancer/cancer.

Our study has two main limitations. First, we cross-validated the algorithm on samples from the same population (the 2010 cohort) with a limited sample size. Its performance in a different population might differ. We will retrain and validate the model on the much larger 2016–2017 cohort from KPNC in two or three years, when enough precancer cases have accumulated. Second, we converted the continuous risk score into a three-level categorical value for a direct comparison with the conventional cytology reading. We could refine the risk groups by classifying the score into more levels (such as five or more), but given the limited sample size of the 2010 cohort, the resultant risk estimate for each level within a given HPV group might not be reliable. In the future, once we have more histopathologic outcomes from the 2016–2017 cohort, we can obtain risk estimates for more refined subgroups defined by risk score and HPV typing to achieve better risk stratification.

The important remaining questions are how much the algorithm can be further improved by targeting the missed cases detected by cytology and whether other automated systems or strategies can outperform this one. Other automatable systems based on different technologies are also in development. The Roche cobas HPV test combined with automated p16-Ki67 dual stain is under development and evaluation (22). A third emerging approach, which is earlier in development and does not require making a slide, is HPV typing combined with viral and host methylation (23,24). It seems likely that one or more of these techniques will permit automation of primary screening and triage, particularly in middle- or even high-income settings.

Future questions will involve the cost-effectiveness of any automated system for middle- and selected low-resource settings, dissemination and implementation, and finding affordable high-quality triage options for low-resource regions. The ultimate goal remains integration of affordable and high-quality screening/triage with vaccination to promote comprehensive cervical cancer control worldwide.

Funding

The study was funded/directed by the National Cancer Institute Intramural Research Program in collaboration with Kaiser Permanente Northern California. BD provided masked human papillomavirus typing.
Notes

Affiliations of authors: Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD (KY, NH, HZ, JCG, NW, MS); Kaiser Permanente Northern California Regional Laboratory, Berkeley, CA (BF, TL, RES, NEP); Kaiser Permanente Northern California Division of Research, Oakland, CA (TRR); Information Management Services, Calverton, MD (WW, BB); Albert Einstein College of Medicine, Bronx, NY (PEC).

The funders had no role in design of the study; the collection, analysis, or interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

References

1. Schiffman M, Doorbar J, Wentzensen N, et al. Carcinogenic human papillomavirus infection. Nat Rev Dis Primers. 2016;2:16086.
2. Bosch FX, Robles C, Diaz M, et al. HPV-FASTER: Broadening the scope for prevention of HPV-related cancer. Nat Rev Clin Oncol. 2016;13(2):119–132.
3. Vaccarella S, Laversanne M, Ferlay J, et al. Cervical cancer in Africa, Latin America and the Caribbean and Asia: Regional inequalities and changing trends. Int J Cancer. 2017;141(10):1997–2001.
4. Schiffman M, Yu K, Zuna R, et al. Proof-of-principle study of a novel cervical screening and triage strategy: Computer-analyzed cytology to decide which HPV-positive women are likely to have ≥CIN2. Int J Cancer. 2017;140(3):718–725.
5. Ronco G, Dillner J, Elfstrom KM, et al. Efficacy of HPV-based screening for prevention of invasive cervical cancer: Follow-up of four European randomised controlled trials. Lancet. 2014;383(9916):524–532.
6. Wentzensen N, Schiffman M, Palmer T, et al. Triage of HPV positive women in cervical cancer screening. J Clin Virol. 2016;76(Suppl 1):S49–S55.
7. Naucler P, Ryd W, Tornberg S, et al. Efficacy of HPV DNA testing with cytology triage and/or repeat HPV DNA testing in primary cervical cancer screening. J Natl Cancer Inst. 2009;101(2):88–99.
8. Huh WK, Ault KA, Chelmow D, et al. Use of primary high-risk human papillomavirus testing for cervical cancer screening: Interim clinical guidance. Obstet Gynecol. 2015;125(2):330–337.
9. Luttmer R, De Strooper LM, Berkhof J, et al. Comparing the performance of FAM19A4 methylation analysis, cytology and HPV16/18 genotyping for the detection of cervical (pre)cancer in high-risk HPV-positive women of a gynecologic outpatient population (COMETH study). Int J Cancer. 2016;138(4):992–1002.
10. Velentzis LS, Caruana M, Simms KT, et al. How will transitioning from cytology to HPV testing change the balance between the benefits and harms of cervical cancer screening? Estimates of the impact on cervical cancer, treatment rates and adverse obstetric outcomes in Australia, a high vaccination coverage country. Int J Cancer. 2017;141(12):2410–2422.
11. Schiffman M, Hyun N, Raine-Bennett TR, et al. A cohort study of cervical screening using partial HPV typing and cytology triage. Int J Cancer. 2016;139(11):2606–2615.
12. Gage JC, Schiffman M, Katki HA, et al. Reassurance against future risk of precancer and cancer conferred by a negative human papillomavirus test. J Natl Cancer Inst. 2014;106(8):dju153.
13. Castle PE, Gutierrez EC, Leitch SV, et al. Evaluation of a new DNA test for detection of carcinogenic human papillomavirus. J Clin Microbiol. 2011;49(8):3029–3032.
14. Carreon JD, Sherman ME, Guillen D, et al. CIN2 is a much less reproducible and less valid diagnosis than CIN3: Results from a histological review of population-based cervical samples. Int J Gynecol Pathol. 2007;26(4):441–446.
15. Kardos TF. The FocalPoint System: FocalPoint slide profiler and FocalPoint GS. Cancer. 2004;102(6):334–339.
16. Tibshirani R. Regression shrinkage and selection via the lasso. J Royal Statist Soc B. 1996;58(1):267–288.
17. Friedman JH, Hastie T, Simon N, et al. Package ‘glmnet.’ https://cran.r-project.org/web/packages/glmnet/glmnet.pdf. Accessed August 1, 2017.
18. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer; 2001.
19. Robin X, Turck N, Hainard A, et al. Package ‘pROC.’ https://cran.r-project.org/web/packages/pROC/pROC.pdf. Accessed August 1, 2017.
20. Hyun N, Cheung L, Pan Q, et al. Flexible risk prediction models for left or interval-censored data from electronic health records. Ann Appl Stat. 2017;11(2):1063–1084.
21. Schiffman M, Kinney WK, Cheung LC, et al. Relative performance of HPV and cytology components of cotesting in cervical screening. J Natl Cancer Inst. 2018;110(5):djx225.
22. Wentzensen N, Fetterman B, Castle PE, et al. p16/Ki-67 dual stain cytology for detection of cervical precancer in HPV-positive women. J Natl Cancer Inst. 2015;107(12):djv257.
23. Lorincz AT, Brentnall AR, Scibior-Bentkowska D, et al. Validation of a DNA methylation HPV triage classifier in a screening sample. Int J Cancer. 2016;138(11):2745–2751.
24. Wentzensen N, Sun C, Ghosh A, et al. Methylation of HPV18, HPV31, and HPV45 genomes and cervical intraepithelial neoplasia grade 3. J Natl Cancer Inst. 2012;104(22):1738–1749.

Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal

JNCI: Journal of the National Cancer Institute, Oxford University Press

Published: Apr 11, 2018
