Automated Cervical Screening and Triage, Based on HPV Testing and Computer-Interpreted Cytology

Background: State-of-the-art cervical cancer prevention includes human papillomavirus (HPV) vaccination among adolescents and screening/treatment of cervical precancer (CIN3/AIS and, less strictly, CIN2) among adults. HPV testing provides sensitive detection of precancer but, to reduce overtreatment, secondary “triage” is needed to predict women at highest risk. Those with the highest-risk HPV types or abnormal cytology are commonly referred to colposcopy; however, expert cytology services are critically lacking in many regions.

Methods: To permit completely automatable cervical screening/triage, we designed and validated a novel triage method, a cytologic risk score algorithm based on computer-scanned liquid-based slide features (FocalPoint, BD, Burlington, NC). We compared it with abnormal cytology in predicting precancer among 1839 women testing HPV positive (HC2, Qiagen, Germantown, MD) in 2010 at Kaiser Permanente Northern California (KPNC). Precancer outcomes were ascertained by record linkage. As additional validation, we compared the algorithm prospectively with cytology results among 243 807 women screened at KPNC (2016–2017). All statistical tests were two-sided.

Results: Among HPV-positive women, the algorithm matched the triage performance of abnormal cytology. Combined with HPV16/18/45 typing (Onclarity, BD, Sparks, MD), the automatable strategy referred 91.7% of HPV-positive CIN3/AIS cases to immediate colposcopy while deferring 38.4% of all HPV-positive women to one-year retesting (compared with 89.1% and 37.4%, respectively, for typing and cytology triage). In the 2016–2017 validation, the predicted risk scores strongly correlated with cytology (P < .001).

Conclusions: High-quality cervical screening and triage performance is achievable using this completely automated approach.
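The statement that the algorithm matched the triage performance of abnormal cytology can be checked against the cell counts published in Table 1. The Python sketch below is illustrative only and is not the study's analysis code (the authors worked in R). It recomputes sensitivity and specificity at the ASC-US-or-worse threshold (intermediate- or high-risk score) and applies McNemar's test to the discordant pairs. The continuity-corrected chi-square form of McNemar's test is an assumption, as are the names `TABLE1`, `cell_sum`, and `mcnemar_p`; the paper does not state which variant was used.

```python
import math

# Table 1 counts as (cases, controls). Rows: cytology group; columns: risk score group.
TABLE1 = {
    ">LSIL":       {"high": (25, 5),  "intermediate": (39, 27),  "low": (5, 8)},
    "ASC-US/LSIL": {"high": (22, 36), "intermediate": (98, 352), "low": (31, 287)},
    "NILM":        {"high": (6, 15),  "intermediate": (44, 266), "low": (40, 533)},
}
CASE, CONTROL = 0, 1


def cell_sum(status, cytology_positive, score_positive):
    """Sum cases or controls over cells with the given pair of test results.

    Cytology is positive when it is not NILM; the risk score is positive
    when the group is intermediate or high (the ASC-US-equivalent cut-point).
    """
    total = 0
    for cyto, row in TABLE1.items():
        for score, cell in row.items():
            if (cyto != "NILM") == cytology_positive and (score != "low") == score_positive:
                total += cell[status]
    return total


def mcnemar_p(b, c):
    """P value of the continuity-corrected McNemar chi-square test (1 df).

    Assumption: the corrected chi-square variant, not the exact binomial test.
    For 1 df, the chi-square survival function equals erfc(sqrt(stat / 2)).
    """
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return math.erfc(math.sqrt(stat / 2))


pairs = [(cp, sp) for cp in (True, False) for sp in (True, False)]
n_cases = sum(cell_sum(CASE, cp, sp) for cp, sp in pairs)
n_controls = sum(cell_sum(CONTROL, cp, sp) for cp, sp in pairs)

# Sensitivity: share of cases that each strategy calls positive.
sens_score = (cell_sum(CASE, True, True) + cell_sum(CASE, False, True)) / n_cases
sens_cyto = (cell_sum(CASE, True, True) + cell_sum(CASE, True, False)) / n_cases
# Specificity: share of controls that each strategy calls negative.
spec_score = (cell_sum(CONTROL, True, False) + cell_sum(CONTROL, False, False)) / n_controls
spec_cyto = (cell_sum(CONTROL, False, True) + cell_sum(CONTROL, False, False)) / n_controls

# McNemar's test uses only the pairs on which the two strategies disagree.
p_sens = mcnemar_p(cell_sum(CASE, False, True), cell_sum(CASE, True, False))
p_spec = mcnemar_p(cell_sum(CONTROL, False, True), cell_sum(CONTROL, True, False))

print(f"risk score: sensitivity {sens_score:.2f}, specificity {spec_score:.2f}")
print(f"cytology:   sensitivity {sens_cyto:.2f}, specificity {spec_cyto:.2f}")
print(f"McNemar P:  sensitivity {p_sens:.2f}, specificity {p_spec:.2f}")
```

Under these assumptions the sketch gives sensitivity 0.75 versus 0.71 and specificity 0.54 versus 0.53, with P values of .16 and .59, in line with the figures reported in the Results.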
Automated technology could permit extension of high-quality cervical screening/triage coverage to currently underserved regions.

Received: December 5, 2017; Revised: January 17, 2018; Accepted: February 21, 2018
Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.
K. Yu et al. | JNCI J Natl Cancer Inst, 2018, Vol. 110, No. 11
Downloaded from https://academic.oup.com/jnci/article/110/11/1222/4961333 by DeepDyve user on 18 July 2022

Cervical cancer is caused by persistent infection with carcinogenic human papillomaviruses (HPV) (1). Prophylactic HPV vaccination to control transmission is the ultimate prevention strategy; however, vaccination remains a long-term solution due to still-limited coverage and the long latency between infection and cancer development (2). Improving and expanding cervical screening with a focus on increasingly vaccinated cohorts and lower-resource settings will remain a global prevention priority for decades; otherwise, the global burden of cervical cancer will continue to rise (3). It might be possible to expand the reach of screening programs using automation to overcome the chronic shortage of skilled laboratory professionals (4).

The main goal of cervical screening is detection of high-grade “precancers” (defined by histopathology of CIN3 and AIS, or less stringently for maximal protection as CIN2/CIN3/AIS), which can be treated to prevent cervical cancer mortality and morbidity. Cervical screening programs include two distinct procedures: screening of the general population and triage of screen-positive women to focus treatment on precancers. For the general screening phase, HPV testing for the carcinogenic types of HPV is gradually replacing cytology (Pap testing), because HPV testing is more sensitive for detection of precancer with more sustained negative predictive value (5). It is also more reproducible and adaptable to increasingly HPV-vaccinated populations.

However, a pressing and unsolved problem is how best to identify the small minority of 5% to 10% of women (or more) testing positive with HPV screening who are most likely to have precancer requiring treatment. Most women control/clear their infections and do not need treatment, which involves ablation or excision of the ring of cervical tissue especially susceptible to HPV-induced carcinogenesis (1). It is partly the cost and complexity of triage testing needed to avoid overtreatment that is impeding widespread adoption of HPV screening programs (6).

The leading triage strategy in higher-resource regions combines partial HPV typing (to identify the highest-risk types) and cytology (used in this case as a second test among HPV-positive women) (7–10). Conventional cytology, which is currently combined with HPV testing in most US screening programs to determine which women should go to colposcopy, is labor-intensive. Use of computer-interpreted cytology for triage would permit the automation of the whole screening process. Here we report the design and evaluation of such a fully automatable cervical screening strategy to determine whether an automated triage algorithm could triage HPV-positive women as accurately as conventionally interpreted liquid-based SurePath cytology (BD, Burlington, NC).

Methods

Study Design and Participants

The study was conducted in two parts, both using data from the Kaiser Permanente Northern California (KPNC) cervical screening program: 1) development and validation of the risk score algorithm predicting precancer in a cohort analysis of women cotested with HPV and computer-scanned liquid-based cytology in 2010 and 2) validation of the association between higher risk scores and HPV-positive, abnormal cytology cotest results, in a large cohort of women screened in 2016–2017.

We developed and initially validated the computer algorithm based on stored, residual cervical specimens and liquid-based cytology slides from the Persistence and Progression (PaP) study conducted within the KPNC cervical screening program, which has used cytology-HPV cotesting since 2003 (11,12). The NCI-KPNC PaP study was formed by collection, neutralization, and freezing of discarded aliquots of routinely collected cervical specimens in STM buffer tested for HPV by HC2 (Qiagen, Germantown, MD). The baseline collection included 54 723 women with cotesting in 2007–2011. When women in the PaP cohort returned for later cotest visits, additional discard specimens were collected and stored. Clinical outcomes of infections were ascertained by merging data from the KPNC cytopathology, histopathology, and HPV testing files, complete at the time of analysis through 2015.

To develop the risk score algorithm, we used HPV test specimens and the corresponding stored cotest liquid-based cytology slides (SurePath, BD Diagnostics, Sparks, MD) from the subgroup of women in the PaP cohort with an HC2-positive baseline or follow-up visit in 2010. The final analytic set (called the 2010 cohort) consisted of 1839 subjects. We considered separately the subset of 20 cases with histopathologic outcomes of invasive cancer, because we were mainly studying screening to find treatable precancer. The flowchart for obtaining the final set is given in Supplementary Figure 1 (available online).

To further validate the risk score algorithm, we used a set of 243 807 scanned SurePath slides from routine cervical screening at KPNC in late 2016 to mid-2017 (called the 2016–2017 cohort).

All specimen use in this study has been approved by both the KPNC and National Cancer Institute Institutional Review Boards. Use of masked discard specimens was judged not to require written informed consent; women could opt out of use of their specimens (8% did).

Partial HPV Typing

We conducted partial HPV typing of the HC2-positive women from 2010 stored aliquots using Onclarity (Becton Dickinson Diagnostics, Sparks, MD), a nine-channel DNA amplification and detection assay recently approved by the US Food and Drug Administration for HPV testing and separate identification of HPV16, HPV18, and HPV45 (13). Here we combined the nine HPV type channel results into four HPV groups according to the earlier work (11). The pre-established, hierarchical order from highest to lowest risk of precancer/cancer was HPV16, else HPV31/33/52/58, else HPV18/45, else HPV35/39/51/56/59/66/68.

Histopathology Outcomes

In the analysis of the 2010 cohort, we first ascertained the worst histologic diagnosis made by KPNC pathologists during follow-up from the time of the screening cotest to the end of 2015 (median = 3.7 years, interquartile range = 2.3–4.6 years). We chose CIN2/CIN3/AIS as the measurement of precancer for algorithm training, because it represents the treatment threshold in this age group at KPNC. When evaluating the performance of the established algorithm, we also considered CIN3/AIS as the more stringent histopathologic diagnoses better representing true precancer (14), although the numbers of cases were more limited. For the absolute risk estimate, we focused on three-year risk of precancer to include an entire screening cycle.

Statistical Analysis

We used the likelihood ratio test based on the linear regression model to compare risk score distributions among different groups, the likelihood ratio test based on the logistic regression model to evaluate the association between a risk factor and the case–control status, and McNemar’s test to compare sensitivity and specificity between two given diagnosis rules. All statistical tests were two-sided, and a P value of less than .05 was considered statistically significant.

Development of the Risk Score Algorithm

We developed a novel risk score algorithm (called “the algorithm”) to triage HPV-positive women using a redesign of BD FocalPoint cytology to target precancer. FocalPoint is a slide scanner that performs a high-speed image capture and outputs 160 binary or quantitative “features” of the cytology slide such as presence of different cell types, nuclear size, and nuclear contour (15). The severity rank generated by the commercial algorithm is currently designed to identify the most innocuous slides in a batch, to reduce and/or guide cytotechnologist workload. Details of the new algorithm that repurposed FocalPoint optical scanning features to target precancer instead are outlined in the Supplementary Methods (available online).

Briefly, based on the 2010 cohort of 1839 subjects, we adopted the least absolute shrinkage and selection operator (LASSO) model (16) implemented in the R package glmnet (17) to create the risk score predicting the likelihood of the presence or imminent development of precancer (CIN2/CIN3/AIS) among HPV-positive women. To have an unbiased evaluation of the risk score using the same training set, we employed the leave-one-out cross-validation (LOOCV) procedure (18) to produce an unbiased estimated risk score for each subject for subsequent analyses of the 2010 cohort.

Comparing the Algorithm With Cytology

Cytology can divide HPV-positive women into one of three categories to reflect the underlying risk for precancer: using Bethesda System terminology, they are normal cytology (NILM), minor HPV-related abnormalities (ASC-US/LSIL), or higher-risk results (>LSIL, ie, AGC/ASC-H/HSIL/AIS/cancer). Most HPV-based screening programs refer to colposcopy all HPV-positive women whose reflex cytology is not normal (ie, ASC-US or worse).

To make a direct comparison between the performances of the algorithm and cytology in stratifying risk of precancer among HPV-positive women, we chose two cut-points in the automated risk score to produce three risk groups (called high, intermediate, and low risk) of exactly the same sizes as the ordinary cytology strategy of >LSIL, ASC-US/LSIL, NILM. We used the R package pROC (19) to calculate receiver operating characteristic (ROC) curves to show the sensitivity/specificity trade-offs for the risk score. To compare the algorithm with cytology, we conducted McNemar’s test to compare sensitivities for the detection of precancer achieved by the two strategies at the ASC-US or worse threshold (intermediate- or high-risk score), or >LSIL threshold (high-risk score).

Assessing the Algorithm and Partial HPV Typing as a Triage Strategy

We compared absolute risks for precancer in the 12 groups of HPV-positive women formed by the combination of four HPV type groups and three cytology or corresponding risk score groups. Using a novel logistic Cox model (20), we estimated three-year cumulative risks (including the risk for precancers present at the time of screening) of CIN3/AIS or CIN2/CIN3/AIS for the 12 strata (Supplementary Methods, available online). We chose this model because of interval censoring and existence of prevalent events. To extend the analysis to the whole KPNC screening population, we also estimated precancer risk for HPV-negative women based on women age 30 years and older from KPNC who tested negative by HC2 in 2012 (n = 239 948).

In current practice in several countries, HPV-positive women are managed according to a combination of partial HPV typing, when available, and cytology (7–10). Women with the highest-risk HPV types (HPV16, HPV18, and sometimes HPV45) and/or abnormal cytology are referred to colposcopy; the remaining HPV-positive women are retested (at one year in the United States). For such a strategy, we compared the triage procedures (ASC-US or worse cytology vs intermediate-risk score threshold), estimating the percentage of precancers occurring within the subsequent three years that they each would have referred. The P value for this comparison was evaluated through a standard bootstrap procedure.

Further Validation in the 2016–2017 Cohort

We applied the established algorithm, trained with the 2010 cohort of 1839 subjects, to FocalPoint output obtained from 243 807 slides performed during routine cervical screening at KPNC in late 2016 to mid-2017. We validated the algorithm results against HPV/cytology cotest results obtained from those same screening visits. We used Spearman’s rank correlation coefficient test for this validation.

Results

Analysis of 2010 Cohort

The 2010 cohort consisted of 1839 HPV-positive women, including the 1529 <CIN2 controls (including women with no referral to colposcopy, or colposcopy/biopsy diagnosis of <CIN2) and 310 cases (181 CIN2 and 129 CIN3/AIS). More detailed baseline data are given in Supplementary Table 1 (available online).

Based on the entire 2010 cohort, the final LASSO model for the risk score selected 55 features (Supplementary Table 2, available online). The distributions of the risk score (unbiased estimate by the LOOCV procedure), stratified by cytology and histopathology, are summarized in Supplementary Figure 2 (available online).

The cytology results of NILM, ASC-US/LSIL, and >LSIL represented 49%, 45%, and 5.9% of samples in the 2010 cohort. By design, the same proportions of subjects were divided into three risk groups (low, intermediate, or high) according to their risk scores. Table 1 shows the number of control and case subjects falling into each group for each strategy, individually and jointly. The three risk groups defined by each strategy were strongly associated with case–control status (P < .001). Although the algorithm and cytology showed good agreement, each generated further risk stratification. For example, after adjusting for the standard cytology strategy, the risk score–based risk groups were still strongly associated with case–control status (P < .001).

Table 1. Concordance between the risk groups based on the cytology result and the risk score, in the 2010 cohort of 1839 HPV-positive women, by case–control status*

Cytology-based risk group                 Risk score group
                                          High                      Intermediate               Low
                                          53 cases (56 controls)    181 cases (645 controls)   76 cases (828 controls)
>LSIL, 69 cases (40 controls)             25 (5)†                   39 (27)                    5 (8)
ASC-US/LSIL, 151 cases (675 controls)     22 (36)                   98 (352)                   31 (287)
NILM, 90 cases (814 controls)             6 (15)                    44 (266)                   40 (533)

*The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the 2010 cohort (904, 826, 109). ASC-US = atypical squamous cells of undetermined significance; CIN2 = cervical intraepithelial neoplasia grade II; HPV = human papillomavirus; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.
†Number of CIN2/CIN3/AIS cases (number of <CIN2 controls) occurring within three years of screening tests.

Figure 1A shows the ROC curve based on the risk score for diagnosis of CIN2/CIN3/AIS, with an area under the curve (AUC) of 0.71 (95% confidence interval [CI] = 0.68 to 0.74). Also shown on the figure are the comparisons of sensitivity/specificity at thresholds corresponding to ASC-US and >LSIL. The risk score corresponding to ASC-US (ie, low- vs intermediate/high-risk groups) had a sensitivity of 0.75 (95% CI = 0.70 to 0.80) and specificity of 0.54 (95% CI = 0.51 to 0.57), neither of which was statistically significantly higher than its counterpart based on the standard cytology strategy, which had a sensitivity of 0.71 (95% CI = 0.66 to 0.76) and a specificity of 0.53 (95% CI = 0.50 to 0.56; P = .16 and .59 for sensitivity and specificity comparisons, respectively). When using the threshold corresponding to >LSIL (meant to define a small highest-risk group), the algorithm had statistically nonsignificantly lower sensitivity/specificity than cytology (P = .08 and .11 for sensitivity and specificity comparisons, respectively).

When cases were restricted to CIN3/AIS (n = 129), with controls still being less than CIN2, the AUC of the risk score was basically unchanged, increasing slightly from 0.71 to 0.74 (95% CI = 0.69 to 0.78) (Figure 1B). The algorithm showed a marginal sensitivity advantage over the corresponding cytology threshold of ASC-US or worse (P = .09).

Figure 1. The automated risk score receiver operating characteristic (ROC) curve for the detection of cases among human papillomavirus (HPV)–positive women and sensitivity/specificity comparisons between risk groups derived from the automated risk score and conventional cytology result. A) For the detection of cervical intraepithelial neoplasia grade II (CIN2)/cervical intraepithelial neoplasia grade III (CIN3)/adenocarcinoma in situ (AIS). B) For the detection of CIN3/AIS. Results were based on the 2010 cohort, with risk score unbiasedly estimated by the leave-one-out cross-validation (LOOCV) procedure. The ROC curve plots the trade-off of sensitivity and specificity for increasingly elevated risk scores. Both areas under the curve reflected good overall discrimination between 310 CIN2/CIN3/AIS cases (or 129 CIN3/AIS cases) and 1529 controls (<CIN2), with better area under the curve (AUC) when cases were defined more stringently as CIN3/AIS (excluding CIN2). Direct comparison of the conventional cytology strategy with the algorithm strategy was achieved by first dividing risk according to cytology into three groups (negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance/low-grade squamous intraepithelial lesion [LSIL], and >LSIL, see the Methods for details). The risk scores were divided at two cut-points to create three risk groups of the same sizes as the cytology. The sensitivity and specificity of the two strategies at those two cut-points were compared. AUC = area under the curve; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

We illustrated how the automated screening and triage strategy, which combined HPV testing, partial HPV typing, and the algorithm, might work in real practice within the total screened population including HPV-negative as well as HPV-positive women, for the more important diagnosis of CIN3/AIS (Table 2) and, secondarily, for CIN2/CIN3/AIS (Supplementary Table 3, available online). Details on how those calculations were done are given in Supplementary Methods (available online). For general screening, approximately 93% of women would fall into the HPV-negative group (as estimated by HC2 negativity in the full KPNC cohort tested in 2010). Although HPV-positive women are the focus of this analysis, it is noteworthy that a single negative HPV test would predict a very low absolute risk of precancer in the subsequent years (21), but within this large group, 9.9% of CIN3/AIS would be missed.

In triage of the 7% of women testing HPV-positive, the automated and conventional strategies would both refer to colposcopy those women with HPV16, HPV18, or HPV45. As the major point of comparison, those with abnormal cytology or intermediate/high-risk score also would be referred to colposcopy, while women with cytologic result of NILM (or low-risk score group) and without HPV16, HPV18, or HPV45 would be retested at one year.

Table 2. Estimated three-year cumulative risk of CIN3/AIS given the HPV typing group and the risk score group/cytology group*

                           Automated strategy                                     Conventional strategy
HPV status and             Score          % of         3-y       % cases          Cytology       % of         3-y       % cases          Action
risk group                 level          population   risk      diagnosed        group          population   risk      diagnosed        required
HPV-positive                              7            0.0724    90.1                            7            0.0724    90.1
HPV16                      High           0.1          0.4117    7.5              >LSIL          0.1          0.5562    12.1             Referral
                           Intermediate   0.5          0.3144    28.4             ASC-US/LSIL    0.5          0.2659    21.0             Referral
                           Low            0.4          0.0967    7.0              NILM           0.5          0.1320    10.3             Referral
HPV31/33/52/58             High           0.1          0.4256    7.5              >LSIL          0.1          0.3700    8.1              Referral
                           Intermediate   1.0          0.1081    18.2             ASC-US/LSIL    1.0          0.1031    16.8             Referral
                           Low            1.1          0.0264    4.7              NILM           1.0          0.0298    5.2              Retest
HPV18/45                   High           <0.1         0.2002    1.3              >LSIL          <0.1         0.1539    1.3              Referral
                           Intermediate   0.3          0.0622    3.4              ASC-US/LSIL    0.3          0.0456    2.1              Referral
                           Low            0.3          <0.0001   <0.1             NILM           0.4          0.0215    1.3              Referral
HPV35/39/51/56/59/66/68    High           0.2          0.0227    0.6              >LSIL          0.1          0.1429    2.6              Referral
                           Intermediate   1.3          0.0389    8.7              ASC-US/LSIL    1.4          0.0197    4.8              Referral
                           Low            1.6          0.010     2.7              NILM           1.6          0.0173    4.6              Retest
HPV-negative               NA             93           0.0006    9.9              NA             93           0.0006    9.9              Screening

*The risk score was estimated by a leave-one-out cross-validated LASSO model that integrated FocalPoint scanned features. The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the analytic set (49%, 45%, and 6%). The risk was estimated by the logistic Cox model. See the Supplementary Methods (available online) for more details on how estimates in the table were obtained. AIS = adenocarcinoma in situ; ASC-US = atypical squamous cells of undetermined significance; CIN3 = cervical intraepithelial neoplasia grade III; HPV = human papillomavirus; LASSO = least absolute shrinkage and selection operator; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

In the whole screened population, the conventional strategy would refer 4.4% of screened women for immediate colposcopy; included in this referred group would be 80.2% of women with prevalent or incident (within three years) CIN3/AIS. The automated strategy would refer a similar proportion of the total population (4.3%) and catch a slightly higher percentage of CIN3/AIS (82.6%, bootstrap estimated P = .54). Focusing on HPV-positive women, the automated strategy would defer 38.4% of them to one-year retesting, while referring the remainder for colposcopy, catching 91.7% of HPV-positive CIN3/AIS cases (compared with 37.4% and 89.1%, respectively, for the conventional strategy). The two strategies also had a comparable performance for the triage of CIN2/CIN3/AIS (Supplementary Table 3, available online).

Characteristics of the 20 cases of invasive cancer that were excluded from the 2010 cohort are given in the Supplementary Methods and Supplementary Table 4 (available online).

2016–2017 Cohort Analysis

Using the risk prediction model established with the 2010 cohort, we obtained the risk score on the 243 807 women in the 2016–2017 validation cohort. Figure 2 shows the distributions of risk scores by cytology result among 22 732 HPV-positive women; the association of elevated algorithm scores and severe cytology result was strong and highly statistically significant (P < .001, based on Spearman’s rank correlation test). The scores tended to be low for all HPV-negative slides except for the very rare HPV-negative ASC-H/AGC/HSIL (Supplementary Figure 3, available online).

Figure 2. Distribution of risk scores by cytology results among human papillomavirus (HPV)–positive women from the 2016–2017 validation cohort. Summaries were based on risk scores of HPV-positive women from the 2016–2017 cohort. The risk score was predicted by the risk score model trained with the 2010 cohort. The average scores were equally low among HPV-infected women with negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance, or low-grade squamous intraepithelial lesion. Scores tended to increase when the cytologic result indicated atypical squamous cells rule out high-grade, or atypical glandular cells. Elevated scores were observed for women with high-grade squamous intraepithelial lesion (HSIL)/adenocarcinoma in situ (AIS; with relatively very few AIS). Two results of cancer were excluded; both had very high scores. HPV-negative women had uniformly low scores, except for extremely rare HPV-negative HSIL (Supplementary Figure 3, available online). AIS = adenocarcinoma in situ; AGC = atypical glandular cells; ASC-H = atypical squamous cells rule out high-grade; ASC-US = atypical squamous cells of undetermined significance; HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

Discussion

Cervical screening programs internationally are transitioning to primary HPV testing, with cytology reserved for triage of HPV-positive women. The results of this study showed that a computer algorithm matches or exceeds cytology triage performance, confirming our previous proof-of-principle study conducted in a referral population (4). Thus, the findings strongly support the feasibility of totally automated cervical screening without cytology.

Automated cervical screening and triage could be particularly appealing for middle-resource settings, permitting extension of high-quality cervical prevention programs to regions limited by lack of skilled cytology professionals. Automation may also be appealing in the United States and other high-resource settings, as an alternative to conventional cytotechnology practice.

Of note, the algorithm proved equivalent in performance to an excellent cytology comparator in deciding which women should be referred to colposcopy. During the study period, KPNC employed several strategies to maximize sensitivity of its SurePath liquid-based cytology approach, including unmasking of HPV status at the time of screening, use of FocalPoint to rank severity of slides within batches, re-reading of a fraction of slides, and review of discrepant cases.

Nonetheless, both cytology and the algorithm remain imperfect triage methods. The specificity of either method was not high, and the agreement on individual cases was not complete; each method diagnosed precancer cases missed by the other. Thus, countries adopting primary HPV testing with triage by partial typing and cytology should continue to seek improvement. We would hope to improve the algorithm further, particularly its specificity, by referring fewer women with the lower-risk HPV types. Also, cytology greater than LSIL outperformed the algorithm in identification of the women at very highest risk of precancer. Thus, further improvements to the algorithm will include study of what severe features the algorithm is missing that are being detected by human review.

Adenocarcinoma is especially difficult to detect by cytology, including automated cytology; its importance is increasing in well-screened populations, as the burden of squamous cell cancer is reduced. The range of causal HPV types is limited (mainly HPV16, HPV18, and HPV45), but cytologic appearance and correlated automated result can be equivocal or even negative. It will be particularly important to find a way to predict which few women with HPV18/45 are at high risk of cancer, given that these types do not result in high risk of precancer within the first year following detection. The importance of these types as major causes of cancer is manifested over a longer time horizon, reflecting some combination of difficulty of diagnosis of endocervical lesions and the particular biological importance for these types of viral integration prior to progression to precancer/cancer.

Our study has two main limitations. First, we cross-validated the algorithm on samples from the same population (the 2010 cohort) with a limited sample size. Its performance in a different population might differ. We will retrain and validate the model on the much larger 2016–2017 cohort from KPNC in two or three years when enough precancer cases are accumulated. Second, we converted the continuous risk score into a three-level categorical value for a direct comparison with the conventional cytology reading. We can refine the risk group by classifying the score into more levels (such as five or more). But given the limited sample size we had in the 2010 cohort, the resultant risk estimate for each level within a given HPV group might not be reliable. In the future, once we have more histopathologic outcomes from the 2016–2017 cohort, we can obtain risk estimates for more refined subgroups defined by risk score and HPV typing to achieve better risk stratification.

The important remaining questions are how much the algorithm can be further improved by targeting the missed cases detected by cytology and whether other automated systems or strategies can outperform this one. Other automatable systems based on different technologies are also in development. The Roche cobas HPV test combined with automated p16-Ki67 dual stain is under development and evaluation (22). A third emerging approach that is earlier in development and does not require making a slide is HPV typing combined with viral and host methylation (23,24). It seems likely that one or more of these techniques will permit automation of primary screening and triage, particularly in middle- or even high-income settings. Future questions will involve the cost-effectiveness of any automated system for middle- and selected low-resource settings, dissemination and implementation, and finding affordable high-quality triage options for low-resource regions. The ultimate goal remains integration of affordable and high-quality screening/triage with vaccination to promote comprehensive cervical cancer control worldwide.

Funding

The study was funded/directed by the National Cancer Institute Intramural Research Program in collaboration with Kaiser Permanente Northern California. BD provided masked human papillomavirus typing.

Notes

Affiliations of authors: Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD (KY, NH, HZ, JCG, NW, MS); Kaiser Permanente Northern California Regional Laboratory, Berkeley, CA (BF, TL, RES, NEP); Kaiser Permanente Northern California Division of Research, Oakland, CA (TRR); Information Management Services, Calverton, MD (WW, BB); Albert Einstein College of Medicine, Bronx, NY (PEC).

The funders had no role in design of the study; the collection, analysis, or interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

References

1. Schiffman M, Doorbar J, Wentzensen N, et al. Carcinogenic human papillomavirus infection. Nat Rev Dis Primers. 2016;2:16086.
2. Bosch FX, Robles C, Diaz M, et al. HPV-FASTER: Broadening the scope for prevention of HPV-related cancer. Nat Rev Clin Oncol. 2016;13(2):119–132.
3. Vaccarella S, Laversanne M, Ferlay J, et al. Cervical cancer in Africa, Latin America and the Caribbean and Asia: Regional inequalities and changing trends. Int J Cancer. 2017;141(10):1997–2001.
4. Schiffman M, Yu K, Zuna R, et al. Proof-of-principle study of a novel cervical screening and triage strategy: Computer-analyzed cytology to decide which HPV-positive women are likely to have >/=CIN2. Int J Cancer. 2017;140(3):718–725.
5. Ronco G, Dillner J, Elfstrom KM, et al. Efficacy of HPV-based screening for prevention of invasive cervical cancer: Follow-up of four European randomised controlled trials. Lancet. 2014;383(9916):524–532.
6. Wentzensen N, Schiffman M, Palmer T, et al. Triage of HPV positive women in cervical cancer screening. J Clin Virol. 2016;76(Suppl 1):S49–S55.
7. Naucler P, Ryd W, Tornberg S, et al. Efficacy of HPV DNA testing with cytology triage and/or repeat HPV DNA testing in primary cervical cancer screening. J Natl Cancer Inst. 2009;101(2):88–99.
8. Huh WK, Ault KA, Chelmow D, et al. Use of primary high-risk human papillomavirus testing for cervical cancer screening: Interim clinical guidance. Obstet Gynecol. 2015;125(2):330–337.
9. Luttmer R, De Strooper LM, Berkhof J, et al. Comparing the performance of FAM19A4 methylation analysis, cytology and HPV16/18 genotyping for the detection of cervical (pre)cancer in high-risk HPV-positive women of a gynecologic outpatient population (COMETH study). Int J Cancer. 2016;138(4):992–1002.
10. Velentzis LS, Caruana M, Simms KT, et al. How will transitioning from cytology to HPV testing change the balance between the benefits and harms of cervical cancer screening? Estimates of the impact on cervical cancer, treatment rates and adverse obstetric outcomes in Australia, a high vaccination coverage country. Int J Cancer. 2017;141(12):2410–2422.
11. Schiffman M, Hyun N, Raine-Bennett TR, et al. A cohort study of cervical
12. Gage JC, Schiffman M, Katki HA, et al. Reassurance against future risk of precancer and cancer conferred by a negative human papillomavirus test. J Natl Cancer Inst. 2014;106(8):dju153.
13. Castle PE, Gutierrez EC, Leitch SV, et al. Evaluation of a new DNA test for detection of carcinogenic human papillomavirus. J Clin Microbiol. 2011;49(8):3029–3032.
14. Carreon JD, Sherman ME, Guillen D, et al. CIN2 is a much less reproducible and less valid diagnosis than CIN3: Results from a histological review of population-based cervical samples. Int J Gynecol Pathol. 2007;26(4):441–446.
15. Kardos TF. The FocalPoint System: FocalPoint slide profiler and FocalPoint GS. Cancer. 2004;102(6):334–339.
16. Tibshirani R. Regression shrinkage and selection via the lasso. J Royal Statist Soc B. 1996;58(1):267–288.
17. Friedman JH, Hastie T, Simon N, et al. Package ‘glmnet.’ https://cran.r-project.org/web/packages/glmnet/glmnet.pdf. Accessed August 1, 2017.
18. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer; 2001.
19. Robin X, Turck N, Hainard A, et al. Package ‘pROC.’ https://cran.r-project.org/web/packages/pROC/pROC.pdf. Accessed August 1, 2017.
20. Hyun N, Cheung L, Pan Q, et al. Flexible risk prediction models for left or interval-censored data from electronic health records. Ann Appl Stat. 2017;11(2):1063–1084.
21. Schiffman M, Kinney WK, Cheung LC, et al. Relative performance of HPV and cytology components of cotesting in cervical screening. J Natl Cancer Inst. 2018;110(5):501–508.
22. Wentzensen N, Fetterman B, Castle PE, et al. p16/Ki-67 dual stain cytology for detection of cervical precancer in HPV-positive women. J Natl Cancer Inst. 2015;107(12):djv257.
23. Lorincz AT, Brentnall AR, Scibior-Bentkowska D, et al. Validation of a DNA methylation HPV triage classifier in a screening sample. Int J Cancer. 2016;138(11):2745–2751.
24. Wentzensen N, Sun C, Ghosh A, et al.
Methylation of HPV18, HPV31, and screening using partial HPV typing and cytology triage. Int J Cancer. 2016; HPV45 genomes and cervical intraepithelial neoplasia grade 3. J Natl Cancer 139(11):2606–2615. Inst. 2012;104(22):1738–1749. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png "JNCI: Journal of the National Cancer Institute" Oxford University Press

Publisher
Oxford University Press
Copyright
Copyright © 2022 Oxford University Press
ISSN
0027-8874
eISSN
1460-2105
DOI
10.1093/jnci/djy044

Abstract

Background: State-of-the-art cervical cancer prevention includes human papillomavirus (HPV) vaccination among adolescents and screening/treatment of cervical precancer (CIN3/AIS and, less strictly, CIN2) among adults. HPV testing provides sensitive detection of precancer but, to reduce overtreatment, secondary “triage” is needed to predict women at highest risk. Those with the highest-risk HPV types or abnormal cytology are commonly referred to colposcopy; however, expert cytology services are critically lacking in many regions.

Methods: To permit completely automatable cervical screening/triage, we designed and validated a novel triage method, a cytologic risk score algorithm based on computer-scanned liquid-based slide features (FocalPoint, BD, Burlington, NC). We compared it with abnormal cytology in predicting precancer among 1839 women testing HPV positive (HC2, Qiagen, Germantown, MD) in 2010 at Kaiser Permanente Northern California (KPNC). Precancer outcomes were ascertained by record linkage. As additional validation, we compared the algorithm prospectively with cytology results among 243 807 women screened at KPNC (2016–2017). All statistical tests were two-sided.

Results: Among HPV-positive women, the algorithm matched the triage performance of abnormal cytology. Combined with HPV16/18/45 typing (Onclarity, BD, Sparks, MD), the automatable strategy referred 91.7% of HPV-positive CIN3/AIS cases to immediate colposcopy while deferring 38.4% of all HPV-positive women to one-year retesting (compared with 89.1% and 37.4%, respectively, for typing and cytology triage). In the 2016–2017 validation, the predicted risk scores strongly correlated with cytology (P < .001).

Conclusions: High-quality cervical screening and triage performance is achievable using this completely automated approach. Automated technology could permit extension of high-quality cervical screening/triage coverage to currently underserved regions.
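The combined triage rule summarized in the Results above (immediate colposcopy for HPV16/18/45 or for an intermediate or high risk score, one-year retesting otherwise) can be written as a small decision function. The following is only an illustrative sketch: the names `RiskGroup`, `REFERRAL_TYPES`, and `triage`, and the action labels, are hypothetical and are not taken from the study software.

```python
from enum import Enum
from typing import Iterable


class RiskGroup(Enum):
    """Three-level automated risk score group (hypothetical labels)."""
    LOW = "low"
    INTERMEDIATE = "intermediate"
    HIGH = "high"


# HPV types that lead to immediate referral in the strategy described above.
REFERRAL_TYPES = frozenset({"HPV16", "HPV18", "HPV45"})


def triage(hpv_positive: bool, hpv_types: Iterable[str], risk_group: RiskGroup) -> str:
    """Return an illustrative management action for one screened woman."""
    if not hpv_positive:
        # HPV-negative women stay in routine screening.
        return "routine screening"
    if REFERRAL_TYPES & set(hpv_types):
        # Highest-risk types are referred regardless of the risk score.
        return "colposcopy"
    if risk_group in (RiskGroup.INTERMEDIATE, RiskGroup.HIGH):
        # Counterpart of "ASC-US or worse" cytology in the conventional strategy.
        return "colposcopy"
    return "retest in one year"


for types, group in [({"HPV16"}, RiskGroup.LOW),
                     ({"HPV31"}, RiskGroup.HIGH),
                     ({"HPV31"}, RiskGroup.LOW)]:
    print(sorted(types), group.value, "->", triage(True, types, group))
```

Replacing the risk score group with the cytology category (NILM, ASC-US/LSIL, >LSIL) gives the conventional comparator with the same structure.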
K. Yu et al.
Received: December 5, 2017; Revised: January 17, 2018; Accepted: February 21, 2018
Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.

Cervical cancer is caused by persistent infection with carcinogenic human papillomaviruses (HPV) (1). Prophylactic HPV vaccination to control transmission is the ultimate prevention strategy; however, vaccination remains a long-term solution due to still-limited coverage and the long latency between infection and cancer development (2). Improving and expanding cervical screening with a focus on increasingly vaccinated cohorts and lower-resource settings will remain a global prevention priority for decades; otherwise, the global burden of cervical cancer will continue to rise (3). It might be possible to expand the reach of screening programs using automation to overcome the chronic shortage of skilled laboratory professionals (4).

The main goal of cervical screening is detection of high-grade “precancers” (defined by histopathology of CIN3 and AIS, or less stringently for maximal protection as CIN2/CIN3/AIS), which can be treated to prevent cervical cancer mortality and morbidity. Cervical screening programs include two distinct procedures: screening of the general population and triage of screen-positive women to focus treatment on precancers. For the general screening phase, HPV testing for the carcinogenic types of HPV is gradually replacing cytology (Pap testing), because HPV testing is more sensitive for detection of precancer with more sustained negative predictive value (5). It is also more reproducible and adaptable to increasingly HPV-vaccinated populations.

However, a pressing and unsolved problem is how best to identify the small minority of 5% to 10% of women (or more) testing positive with HPV screening who are most likely to have precancer requiring treatment. Most women control/clear their infections and do not need treatment, which involves ablation or excision of the ring of cervical tissue especially susceptible to HPV-induced carcinogenesis (1). It is partly the cost and complexity of triage testing needed to avoid overtreatment that is impeding widespread adoption of HPV screening programs (6).

The leading triage strategy in higher-resource regions combines partial HPV typing (to identify the highest-risk types) and cytology (used in this case as a second test among HPV-positive women) (7–10). Conventional cytology, which is currently combined with HPV testing in most US screening programs to determine which women should go to colposcopy, is labor-intensive. Use of computer-interpreted cytology for triage would permit the automation of the whole screening process. Here we report the design and evaluation of such a fully automatable cervical screening strategy to determine whether an automated triage algorithm could triage HPV-positive women as accurately as conventionally interpreted liquid-based SurePath cytology (BD, Burlington, NC).

Methods

Study Design and Participants

The study was conducted in two parts, both using data from the Kaiser Permanente Northern California (KPNC) cervical screening program: 1) development and validation of the risk score algorithm predicting precancer in a cohort analysis of women cotested with HPV and computer-scanned liquid-based cytology in 2010 and 2) validation of the association between higher risk scores and HPV-positive, abnormal cytology cotest results, in a large cohort of women screened in 2016–2017.

We developed and initially validated the computer algorithm based on stored, residual cervical specimens and liquid-based cytology slides from the Persistence and Progression (PaP) study conducted within the KPNC cervical screening program, which has used cytology-HPV cotesting since 2003 (11,12). The NCI-KPNC PaP study was formed by collection, neutralization, and freezing of discarded aliquots of routinely collected cervical specimens in STM buffer tested for HPV by HC2 (Qiagen, Germantown, MD). The baseline collection included 54 723 women with cotesting in 2007–2011. When women in the PaP cohort returned for later cotest visits, additional discard specimens were collected and stored. Clinical outcomes of infections were ascertained by merging data from the KPNC cytopathology, histopathology, and HPV testing files, complete at the time of analysis through 2015.

To develop the risk score algorithm, we used HPV test specimens and the corresponding stored cotest liquid-based cytology slides (SurePath, BD Diagnostics, Sparks, MD) from the subgroup of women in the PaP cohort with an HC2-positive baseline or follow-up visit in 2010. The final analytic set (called the 2010 cohort) consisted of 1839 subjects. We considered separately the subset of 20 cases with histopathologic outcomes of invasive cancer, because we were mainly studying screening to find treatable precancer. The flowchart for obtaining the final set is given in Supplementary Figure 1 (available online).

To further validate the risk score algorithm, we used a set of 243 807 scanned SurePath slides from routine cervical screening at KPNC in late 2016 to mid-2017 (called the 2016–2017 cohort).

All specimen use in this study has been approved by both the KPNC and National Cancer Institute Institutional Review Boards. Use of masked discard specimens was judged not to require written informed consent; women could opt out of use of their specimens (8% did).

Partial HPV Typing

We conducted partial HPV typing of the HC2-positive women from 2010 stored aliquots using Onclarity (Becton Dickinson Diagnostics, Sparks, MD), a nine-channel DNA amplification and detection assay recently approved by the US Food and Drug Administration for HPV testing and separate identification of HPV16, HPV18, and HPV45 (13). Here we combined the nine HPV type channel results into four HPV groups according to the earlier work (11). The pre-established, hierarchical order from highest to lowest risk of precancer/cancer was HPV16, else HPV31/33/52/58, else HPV18/45, else HPV35/39/51/56/59/66/68.

Histopathology Outcomes

In the analysis of the 2010 cohort, we first ascertained the worst histologic diagnosis made by KPNC pathologists during follow-up from the time of the screening cotest to the end of 2015 (median = 3.7 years, interquartile range = 2.3–4.6 years). We chose CIN2/CIN3/AIS as the measurement of precancer for algorithm training, because it represents the treatment threshold in this age group at KPNC. When evaluating the performance of the established algorithm, we also considered CIN3/AIS as the more stringent histopathologic diagnoses better representing true precancer (14), although the numbers of cases were more limited. For the absolute risk estimate, we focused on three-year risk of precancer to include an entire screening cycle.

Statistical Analysis

We used the likelihood ratio test based on the linear regression model to compare risk score distributions among different groups, the likelihood ratio test based on the logistic regression model to evaluate the association between a risk factor and the case–control status, and McNemar’s test to compare sensitivity and specificity between two given diagnosis rules. All statistical tests were two-sided, and a P value of less than .05 was considered statistically significant.

Development of the Risk Score Algorithm

We developed a novel risk score algorithm (called “the algorithm”) to triage HPV-positive women using a redesign of BD FocalPoint cytology to target precancer. FocalPoint is a slide scanner that performs a high-speed image capture and outputs 160 binary or quantitative “features” of the cytology slide such as presence of different cell types, nuclear size, and nuclear contour (15). The severity rank generated by the commercial algorithm is currently designed to identify the most innocuous slides in a batch, to reduce and/or guide cytotechnologist workload. Details of the new algorithm that repurposed FocalPoint optical scanning features to target precancer instead are outlined in the Supplementary Methods (available online).

Briefly, based on the 2010 cohort of 1839 subjects, we adopted the least absolute shrinkage and selection operator (LASSO) model (16) implemented in the R package glmnet (17) to create the risk score predicting the likelihood of the presence or imminent development of precancer (CIN2/CIN3/AIS) among HPV-positive women. To have an unbiased evaluation of the risk score using the same training set, we employed the leave-one-out cross-validation (LOOCV) procedure (18) to produce an unbiased estimated risk score for each subject for subsequent analyses of the 2010 cohort.

Comparing the Algorithm With Cytology

Cytology can divide HPV-positive women into one of three categories to reflect the underlying risk for precancer: using Bethesda System terminology, they are normal cytology (NILM), minor HPV-related abnormalities (ASC-US/LSIL), or higher-risk results (>LSIL, ie, AGC/ASC-H/HSIL/AIS/cancer). Most HPV-based screening programs refer to colposcopy all HPV-positive women whose reflex cytology is not normal (ie, ASC-US or worse).

To make a direct comparison between the performances of the algorithm and cytology in stratifying risk of precancer among HPV-positive women, we chose two cut-points in the automated risk score to produce three risk groups (called high, intermediate, and low risk) of exactly the same sizes as the ordinary cytology strategy of >LSIL, ASC-US/LSIL, NILM. We used the R package pROC (19) to calculate receiver operating characteristic (ROC) curves to show the sensitivity/specificity trade-offs for the risk score. To compare the algorithm with cytology, we conducted McNemar’s test to compare sensitivities for the detection of precancer achieved by the two strategies at the ASC-US or worse threshold (intermediate- or high-risk score), or >LSIL threshold (high-risk score).

Assessing the Algorithm and Partial HPV Typing as a Triage Strategy

We compared absolute risks for precancer in the 12 groups of HPV-positive women formed by the combination of four HPV type groups and three cytology or corresponding risk score groups. Using a novel logistic Cox model (20), we estimated three-year cumulative risks (including the risk for precancers present at the time of screening) of CIN3/AIS or CIN2/CIN3/AIS for the 12 strata (Supplementary Methods, available online). We chose this model because of interval censoring and existence of prevalent events. To extend the analysis to the whole KPNC screening population, we also estimated precancer risk for HPV-negative women based on women age 30 years and older from KPNC who tested negative by HC2 in 2012 (n = 239 948).

In current practice in several countries, HPV-positive women are managed according to a combination of partial HPV typing, when available, and cytology (7–10). Women with the highest-risk HPV types (HPV16, HPV18, and sometimes HPV45) and/or abnormal cytology are referred to colposcopy; the remaining HPV-positive women are retested (at one year in the United States). For such a strategy, we compared the triage procedures (ASC-US or worse cytology vs intermediate-risk score threshold), estimating the percentage of precancers occurring within the subsequent three years that they each would have referred. The P value for this comparison was evaluated through a standard bootstrap procedure.

Further Validation in the 2016–2017 Cohort

We applied the established algorithm, trained with the 2010 cohort of 1839 subjects, to FocalPoint output obtained from 243 807 slides performed during routine cervical screening at KPNC in late 2016 to mid-2017. We validated the algorithm results against HPV/cytology cotest results obtained from those same screening visits. We used Spearman’s rank correlation coefficient test for this validation.

Results

Analysis of 2010 Cohort

The 2010 cohort consisted of 1839 HPV-positive women, including the 1529 <CIN2 controls (including women with no referral to colposcopy, or colposcopy/biopsy diagnosis of <CIN2) and 310 cases (181 CIN2 and 129 CIN3/AIS). More detailed baseline data are given in Supplementary Table 1 (available online).

Based on the entire 2010 cohort, the final LASSO model for the risk score selected 55 features (Supplementary Table 2, available online). The distributions of the risk score (unbiased estimate by the LOOCV procedure), stratified by cytology and histopathology, are summarized in Supplementary Figure 2 (available online).

The cytology results of NILM, ASC-US/LSIL, and >LSIL represented 49%, 45%, and 5.9% of samples in the 2010 cohort. By design, the same proportions of subjects were divided into three risk groups (low, intermediate, or high) according to their risk scores. Table 1 shows the number of control and case subjects falling into each group for each strategy, individually and jointly. The three risk groups defined by each strategy were strongly associated with case–control status (P < .001). Although the algorithm and cytology showed good agreement, each generated further risk stratification. For example, after adjusting for the standard cytology strategy, the risk score–based risk groups were still strongly associated with case–control status (P < .001).

Table 1. Concordance between the risk groups based on the cytology result and the risk score, in the 2010 cohort of 1839 HPV-positive women, by case–control status*

Cytology-based risk group | Risk score group: High, 53 cases (56 controls) | Risk score group: Intermediate, 181 cases (645 controls) | Risk score group: Low, 76 cases (828 controls)
>LSIL, 69 cases (40 controls) | 25 (5)† | 39 (27) | 5 (8)
ASC-US/LSIL, 151 cases (675 controls) | 22 (36) | 98 (352) | 31 (287)
NILM, 90 cases (814 controls) | 6 (15) | 44 (266) | 40 (533)

*The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the 2010 cohort (904, 826, 109). ASC-US = atypical squamous cells of undetermined significance; CIN2 = cervical intraepithelial neoplasia grade II; HPV = human papillomavirus; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.
†Number of CIN2/CIN3/AIS cases (number of <CIN2 controls) occurring within three years of screening tests.

Figure 1A shows the ROC curve based on the risk score for diagnosis of CIN2/CIN3/AIS, with an area under the curve (AUC) of 0.71 (95% confidence interval [CI] = 0.68 to 0.74). Also shown on the figure are the comparisons of sensitivity/specificity at thresholds corresponding to ASC-US and >LSIL. The risk score corresponding to ASC-US (ie, low- vs intermediate/high-risk groups) had a sensitivity of 0.75 (95% CI = 0.70 to 0.80) and specificity of 0.54 (95% CI = 0.51 to 0.57), neither of which was statistically significantly higher than its counterpart based on the standard cytology strategy, which had a sensitivity of 0.71 (95% CI = 0.66 to 0.76) and a specificity of 0.53 (95% CI = 0.50 to 0.56, P = .16 and .59 for sensitivity and specificity comparisons, respectively). When using the threshold corresponding to >LSIL (meant to define a small highest-risk group), the algorithm had statistically nonsignificantly lower sensitivity/specificity than cytology (P = .08 and .11 for sensitivity and specificity comparisons, respectively).

When cases were restricted to CIN3/AIS (n = 129), with controls still being less than CIN2, the AUC of the risk score was basically unchanged, increasing slightly from 0.71 to 0.74 (95% CI = 0.69 to 0.78) (Figure 1B). The algorithm showed a marginal sensitivity advantage over the corresponding cytology threshold >ASC-US (P = .09).

Figure 1. The automated risk score receiver operating characteristic (ROC) curve for the detection of cases among human papillomavirus (HPV)–positive women and sensitivity/specificity comparisons between risk groups derived from the automated risk score and conventional cytology result. A) For the detection of cervical intraepithelial neoplasia grade II (CIN2)/cervical intraepithelial neoplasia grade III (CIN3)/adenocarcinoma in situ (AIS). B) For the detection of CIN3/AIS. Results were based on the 2010 cohort, with risk score unbiasedly estimated by the leave-one-out cross-validation (LOOCV) procedure. The ROC curve plots the trade-off of sensitivity and specificity for increasingly elevated risk scores. Both areas under the curve reflected good overall discrimination between 310 CIN2/CIN3/AIS cases (or 129 CIN3/AIS cases) and 1529 controls (<CIN2), with better area under the curve (AUC) when cases were defined more stringently as CIN3/AIS (excluding CIN2). Direct comparison of the conventional cytology strategy with the algorithm strategy was achieved by first dividing risk according to cytology into three groups (negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance/low-grade squamous intraepithelial lesion [LSIL], and >LSIL, see the Methods for details). The risk scores were divided at two cut-points to create three risk groups of the same sizes as the cytology. The sensitivity and specificity of the two strategies at those two cut-points were compared. AUC = area under the curve; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

We illustrated how the automated screening and triage strategy, which combined HPV testing, partial HPV typing, and the algorithm, might work in real practice within the total screened population including HPV-negative as well as HPV-positive women, for the more important diagnosis of CIN3/AIS (Table 2) and, secondarily, for CIN2/CIN3/AIS (Supplementary Table 3, available online). Details on how those calculations were done are given in Supplementary Methods (available online).

For general screening, approximately 93% of women would fall into the HPV-negative group (as estimated by HC2 negativity in the full KPNC cohort tested in 2010). Although HPV-positive women are the focus of this analysis, it is noteworthy that a single negative HPV test would predict a very low absolute risk of precancer in the subsequent years (21), but within this large group, 9.9% of CIN3/AIS would be missed.

In triage of the 7% of women testing HPV-positive, the automated and conventional strategies would both refer to colposcopy those women with HPV16, HPV18, or HPV45. As the major point of comparison, those with abnormal cytology or intermediate/high-risk score also would be referred to colposcopy, while women with cytologic result of NILM (or low-risk score group) and without HPV16, HPV18, or HPV45 would be retested at one year.

In the whole screened population, the conventional strategy would refer 4.4% of screened women for immediate colposcopy; included in this referred group would be 80.2% of women with prevalent or incident (within three years) CIN3/AIS. The automated strategy would refer a similar proportion of the total population (4.3%) and catch a slightly higher percentage of CIN3/AIS (82.6%, bootstrap estimated P = .54). Focusing on HPV-positive women, the automated strategy would defer 38.4% of them to one-year retesting, while referring the remaining for colposcopy, catching 91.7% of HPV-positive CIN3/AIS cases (compared with 37.4% and 89.1%, respectively, for the conventional strategy). The two strategies also had a comparable performance for the triage of CIN2/CIN3/AIS (Supplementary Table 3, available online).

Table 2. Estimated three-year cumulative risk of CIN3/AIS given the HPV typing group and the risk score group/cytology group*

HPV status and risk group | Automated: score level | Automated: % of population | Automated: 3-y risk | Automated: % cases diagnosed | Conventional: cytology group | Conventional: % of population | Conventional: 3-y risk | Conventional: % cases diagnosed | Action required
HPV-positive (all) | | 7 | 0.0724 | 90.1 | | 7 | 0.0724 | 90.1 |
HPV16 | High | 0.1 | 0.4117 | 7.5 | >LSIL | 0.1 | 0.5562 | 12.1 | Referral
HPV16 | Intermediate | 0.5 | 0.3144 | 28.4 | ASC-US/LSIL | 0.5 | 0.2659 | 21.0 | Referral
HPV16 | Low | 0.4 | 0.0967 | 7.0 | NILM | 0.5 | 0.1320 | 10.3 | Referral
HPV31/33/52/58 | High | 0.1 | 0.4256 | 7.5 | >LSIL | 0.1 | 0.3700 | 8.1 | Referral
HPV31/33/52/58 | Intermediate | 1.0 | 0.1081 | 18.2 | ASC-US/LSIL | 1.0 | 0.1031 | 16.8 | Referral
HPV31/33/52/58 | Low | 1.1 | 0.0264 | 4.7 | NILM | 1.0 | 0.0298 | 5.2 | Retest
HPV18/45 | High | <0.1 | 0.2002 | 1.3 | >LSIL | <0.1 | 0.1539 | 1.3 | Referral
HPV18/45 | Intermediate | 0.3 | 0.0622 | 3.4 | ASC-US/LSIL | 0.3 | 0.0456 | 2.1 | Referral
HPV18/45 | Low | 0.3 | <0.0001 | <0.1 | NILM | 0.4 | 0.0215 | 1.3 | Referral
HPV35/39/51/56/59/66/68 | High | 0.2 | 0.0227 | 0.6 | >LSIL | 0.1 | 0.1429 | 2.6 | Referral
HPV35/39/51/56/59/66/68 | Intermediate | 1.3 | 0.0389 | 8.7 | ASC-US/LSIL | 1.4 | 0.0197 | 4.8 | Referral
HPV35/39/51/56/59/66/68 | Low | 1.6 | 0.010 | 2.7 | NILM | 1.6 | 0.0173 | 4.6 | Retest
HPV-negative | NA | 93 | 0.0006 | 9.9 | NA | 93 | 0.0006 | 9.9 | Screening

*The risk score was estimated by a leave-one-out cross-validated LASSO model that integrated FocalPoint scanned features. The three risk score groups (low, intermediate, and high) were defined by two chosen cut-points that divided the risk score into three groups of exactly the same sizes as the standard cytology reading groups, defined as NILM, LSIL/ASC-US, and >LSIL, observed in the analytic set (49%, 45%, and 6%). The risk was estimated by the logistic Cox model. See the Supplementary Methods (available online) for more details on how estimates in the table were obtained. AIS = adenocarcinoma in situ; ASC-US = atypical squamous cells of undetermined significance; CIN3 = cervical intraepithelial neoplasia grade III; HPV = human papillomavirus; LASSO = least absolute shrinkage and selection operator; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

Characteristics of the 20 cases of invasive cancer that were excluded from the 2010 cohort are given in the Supplementary Methods and Supplementary Table 4 (available online).

2016–2017 Cohort Analysis

Using the risk prediction model established with the 2010 cohort, we obtained the risk score on the 243 807 women in the 2016–2017 validation cohort. Figure 2 shows the distributions of risk scores by cytology result among 22 732 HPV-positive women; the association of elevated algorithm scores and severe cytology result was strong and highly statistically significant (P < .001, based on Spearman’s rank correlation test). The scores tended to be low for all HPV-negative slides except for the very rare HPV-negative ASC-H/AGC/HSIL (Supplementary Figure 3, available online).

Figure 2. Distribution of risk scores by cytology results among human papillomavirus (HPV)–positive women from the 2016–2017 validation cohort. Summaries were based on risk scores of HPV-positive women from the 2016–2017 cohort. The risk score was predicted by the risk score model trained with the 2010 cohort. The average scores were equally low among HPV-infected women with negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance, or low-grade squamous intraepithelial lesion. Scores tended to increase when the cytologic result indicated atypical squamous cells rule out high-grade, or atypical glandular cells. Elevated scores were observed for women with high-grade squamous intraepithelial lesion (HSIL)/adenocarcinoma in situ (AIS; with relatively very few AIS). Two results of cancer were excluded; both had very high scores. HPV-negative women had uniformly low scores, except for extremely rare HPV-negative HSIL (Supplementary Figure 3, available online). AIS = adenocarcinoma in situ; AGC = atypical glandular cells; ASC-H = atypical squamous cells rule out high-grade; ASC-US = atypical squamous cells of undetermined significance; HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; LSIL = low-grade squamous intraepithelial lesion; NILM = negative for intraepithelial lesion or malignancy.

Discussion

Cervical screening programs internationally are transitioning to primary HPV testing, with cytology reserved for triage of HPV-positive women. The results of this study showed that a computer algorithm matches or exceeds cytology triage performance, confirming our previous proof-of-principle study conducted in a referral population (4). Thus, the findings strongly support the feasibility of totally automated cervical screening without cytology.

Automated cervical screening and triage could be particularly appealing for middle-resource settings, permitting extension of high-quality cervical prevention programs to regions limited by lack of skilled cytology professionals. Automation may also be appealing in the United States and other high-resource settings, as an alternative to conventional cytotechnology practice.

Of note, the algorithm proved equivalent in performance to an excellent cytology comparator in deciding which women should be referred to colposcopy. During the study period, KPNC employed several strategies to maximize sensitivity of its SurePath liquid-based cytology approach, including unmasking of HPV status at the time of screening, use of FocalPoint to rank severity of slides within batches, re-reading of a fraction of slides, and review of discrepant cases.

Nonetheless, both cytology and the algorithm remain imperfect triage methods. The specificity of either method was not high, and the agreement on individual cases was not complete; each method diagnosed precancer cases missed by the other. Thus, countries adopting primary HPV testing with triage by partial typing and cytology should continue to seek improvement. We would hope to improve the algorithm further, particularly its specificity, by referring fewer women with the lower-risk HPV types. Also, cytology greater than LSIL outperformed the algorithm in identification of the women at very highest risk of precancer. Thus, further improvements to the algorithm will include study of what severe features the algorithm is missing that are being detected by human review.

Adenocarcinoma is especially difficult to detect by cytology, including automated cytology; its importance is increasing in well-screened populations, as the burden of squamous cell cancer is reduced. The range of causal HPV types is limited (mainly HPV16, HPV18, and HPV45), but cytologic appearance and correlated automated result can be equivocal or even negative. It will be particularly important to find a way to predict which few women with HPV18/45 are at high risk of cancer, given that these types do not result in high risk of precancer within the first year following detection. The importance of these types as major causes of cancer is manifested over a longer time horizon, reflecting some combination of difficulty of diagnosis of endocervical lesions and the particular biological importance for these types of viral integration prior to progression to precancer/cancer.

Our study has two main limitations. First, we cross-validated the algorithm on samples from the same population (the 2010 cohort) with a limited sample size. Its performance in a different population might vary. We will retrain and validate the model on the much larger 2016–2017 cohort from KPNC in two or three years when enough precancer cases are accumulated. Second, we converted the continuous risk score into a three-level categorical value for a direct comparison with the conventional cytology reading. We can refine the risk group by classifying the score into more levels (such as five or more). But given the limited sample size we had in the 2010 cohort, the resultant risk estimate for each level within a given HPV group might not be reliable.

triage, particularly in middle- or even high-income settings. Future questions will involve the cost-effectiveness of any automated system for middle- and selected low-resource settings, dissemination and implementation, and finding affordable high-quality triage options for low-resource regions. The ultimate goal remains integration of affordable and high-quality screening/triage with vaccination to promote comprehensive cervical cancer control worldwide.

Funding

The study was funded/directed by the National Cancer Institute Intramural Research Program in collaboration with Kaiser Permanente Northern California. BD provided masked human papillomavirus typing.
In the future, once we have more histopathologic outcomes from the 2016–2017 co- hort, we can obtain risk estimates for more refined subgroups Notes defined by risk score and HPV typing to achieve better risk Affiliations of authors: Division of Cancer Epidemiology and stratification. Genetics, National Cancer Institute, National Institutes of The important remaining questions are how much the algo- Health, Rockville, MD (KY, NH, HZ, JCG, NW, MS); Kaiser rithm can be further improved by targeting the missed cases Permanente Northern California Regional Laboratory, Berkeley, detected by cytology and whether other automated systems or CA (BF, TL, RES, NEP); Kaiser Permanente Northern California strategies can outperform this one. Other automatable systems Division of Research, Oakland, CA (TRR); Information based on different technologies are also in development. The Management Services, Calverton, MD (WW, BB); Albert Einstein Roche cobas HPV test combined with automated p16-Ki67 dual stain is under development and evaluation (22). A third emerg- College of Medicine, Bronx, NY (PEC). The funders had no role in design of the study; the collec- ing approach that is earlier in development and does not re- tion, analysis, or interpretation of the data; the writing of the quire making a slide is HPV typing combined with viral and host manuscript; or the decision to submit the manuscript for methylation (23,24). It seems likely that one or more of these techniques will permit automation of primary screening and publication. ARTICLE Downloaded from https://academic.oup.com/jnci/article/110/11/1222/4961333 by DeepDyve user on 18 July 2022 ARTICLE 1228 | JNCI J Natl Cancer Inst, 2018, Vol. 110, No. 11 12. Gage JC, Schiffman M, Katki HA, et al. Reassurance against future risk of pre- References cancer and cancer conferred by a negative human papillomavirus test. J Natl 1. Schiffman M, Doorbar J, Wentzensen N, et al. Carcinogenic human papillo- Cancer Inst. 2014;106(8):dju153. 
mavirus infection. Nat Rev Dis Primers. 2016;2:16086. 13. Castle PE, Gutierrez EC, Leitch SV, et al. Evaluation of a new DNA test for de- 2. Bosch FX, Robles C, Diaz M, et al. HPV-FASTER: Broadening the scope for pre- tection of carcinogenic human papillomavirus. J Clin Microbiol. 2011;49(8): vention of HPV-related cancer. Nat Rev Clin Oncol. 2016;13(2):119–132. 3029–3032. 3. Vaccarella S, Laversanne M, Ferlay J, et al. Cervical cancer in Africa, Latin 14. Carreon JD, Sherman ME, Guillen D, et al. CIN2 is a much less reproducible America and the Caribbean and Asia: Regional inequalities and changing and less valid diagnosis than CIN3: Results from a histological review of trends. Int J Cancer. 2017;141(10):1997–2001. population-based cervical samples. Int J Gynecol Pathol. 2007;26(4):441–446. 4. Schiffman M, Yu K, Zuna R, et al. Proof-of-principle study of a novel cervical 15. Kardos TF. The FocalPoint System: FocalPoint slide profiler and FocalPoint screening and triage strategy: Computer-analyzed cytology to decide which GS. Cancer. 2004;102(6):334–339. HPV-positive women are likely to have >/¼CIN2. Int J Cancer. 2017;140(3):718–725. 16. Tibshirani R. Regression shrinkage and selection via the lasso. J Royal Statist 5. Ronco G, Dillner J, Elfstrom KM, et al. Efficacy of HPV-based screening for pre- Soc B. 1996;58(1):267–288. vention of invasive cervical cancer: Follow-up of four European randomised 17. Friedman JH, Hastie T, Simon N, et al. Package ‘glmnet.’ https://cran.r- controlled trials. Lancet. 2014;383(9916):524–532. project.org/web/packages/glmnet/glmnet.pdf. Accessed August 1, 2017. 6. Wentzensen N, Schiffman M, Palmer T, et al. Triage of HPV positive women 18. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data in cervical cancer screening. J Clin Virol. 2016;76(Suppl 1):S49–S55. Mining, Inference, and Prediction. New York: Springer; 2001. 7. Naucler P, Ryd W, Tornberg S, et al. Efficacy of HPV DNA testing with cytology 19. 
Robin X, Turck N, Hainard A, et al. Package ‘pROC.’ https://cran.r-project.org/ triage and/or repeat HPV DNA testing in primary cervical cancer screening. J web/packages/pROC/pROC.pdf. Assessed August 1, 2017. Natl Cancer Inst. 2009;101(2):88–99. 20. Hyun N, Chenug L, Pan Q, et al. Flexible risk prediction models for left or 8. Huh WK, Ault KA, Chelmow D, et al. Use of primary high-risk human papillo- interval-censored data from electronic health records. Ann Appl Stat. 2017; mavirus testing for cervical cancer screening: Interim clinical guidance. 11(2):1063–1084. Obstet Gynecol. 2015;125(2):330–337. 21. Schiffman M, Kinney WK, Cheung LC, et al. Relative performance of HPV and 9. Luttmer R, De Strooper LM, Berkhof J, et al. Comparing the performance of cytology components of cotesting in cervical screening. J Natl Cancer Inst. FAM19A4 methylation analysis, cytology and HPV16/18 genotyping for the 2018;110(5):501–508. detection of cervical (pre)cancer in high-risk HPV-positive women of a gyneco- 22. Wentzensen N, Fetterman B, Castle PE, et al. p16/Ki-67 dual stain cytology for logic outpatient population (COMETH study). Int J Cancer. 2016;138(4):992–1002. detection of cervical precancer in HPV-positive women. J Natl Cancer Inst. 10. Velentzis LS, Caruana M, Simms KT, et al. How will transitioning from cytol- 2015;107(12):djv257. ogy to HPV testing change the balance between the benefits and harms of 23. Lorincz AT, Brentnall AR, Scibior-Bentkowska D, et al. Validation of a DNA cervical cancer screening? Estimates of the impact on cervical cancer, treat- methylation HPV triage classifier in a screening sample. Int J Cancer. 2016; ment rates and adverse obstetric outcomes in Australia, a high vaccination coverage country. Int J Cancer. 2017;141(12):2410–2422. 138(11):2745–2751. 11. Schiffman M, Hyun N, Raine-Bennett TR, et al. A cohort study of cervical 24. Wentzensen N, Sun C, Ghosh A, et al. 
Methylation of HPV18, HPV31, and screening using partial HPV typing and cytology triage. Int J Cancer. 2016; HPV45 genomes and cervical intraepithelial neoplasia grade 3. J Natl Cancer 139(11):2606–2615. Inst. 2012;104(22):1738–1749.
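
Illustrative note on the rank correlation check. The 2016–2017 cohort analysis reports a Spearman rank correlation test between algorithm risk score and cytology result. The following is a minimal sketch of how such a coefficient can be computed, not the authors' code. The ordinal coding of cytology categories (`CYTOLOGY_GRADE`), the function names, and the sample values are all hypothetical and chosen only for illustration; the sketch computes the coefficient alone and does not reproduce the P value.

```python
from typing import List, Sequence

# Hypothetical ordinal coding of cytology severity (an assumption, not from the paper).
CYTOLOGY_GRADE = {"NILM": 0, "ASC-US": 1, "LSIL": 2, "ASC-H": 3, "HSIL": 4}


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: one cytology label and one risk score per slide.
labels = ["NILM", "NILM", "ASC-US", "LSIL", "ASC-H", "HSIL", "HSIL"]
scores = [0.02, 0.05, 0.04, 0.06, 0.30, 0.55, 0.80]
grades = [CYTOLOGY_GRADE[label] for label in labels]
rho = spearman_rho(grades, scores)
print(f"Spearman rho = {rho:.3f}")
```

Because cytology results are ordered categories with many ties, ranks are averaged within ties before correlating; a rank-based measure is a natural choice here since it does not assume the risk score rises linearly with cytologic severity.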

Journal

JNCI: Journal of the National Cancer Institute, Oxford University Press

Published: Nov 1, 2018

Keywords: cytology; human papillomavirus; computers; triage; human papillomavirus test; precancerous conditions; colposcopy
